Search Results: "ecki"

13 December 2023

Melissa Wen: 15 Tips for Debugging Issues in the AMD Display Kernel Driver

A self-help guide for examining and debugging the AMD display driver within the Linux kernel/DRM subsystem. It's based on my experience as an external developer working on the driver, and is shared with the goal of helping others navigate the driver code. Acknowledgments: These tips were gathered thanks to the countless help received from AMD developers during the driver development process. The list below was obtained by examining open source code, reviewing public documentation, playing with tools, asking in public forums and also with the help of my former GSoC mentor, Rodrigo Siqueira.

Pre-Debugging Steps: Before diving into an issue, it's crucial to perform two essential steps: 1) Check the latest changes: Ensure you're working with the latest AMD driver modifications located in the amd-staging-drm-next branch maintained by Alex Deucher. You may also find bug fixes for newer kernel versions on branches that follow the name pattern drm-fixes-<date>. 2) Examine the issue tracker: Confirm that your issue isn't already documented and addressed in the AMD display driver issue tracker. If you find a similar issue, you can team up with others and speed up the debugging process.
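A rough sketch of fetching that branch (the repository URL is my assumption of where Alex Deucher's tree lives; double-check it before relying on it):
    # add Alex Deucher's tree as a remote and check out the staging branch (URL assumed)
    git remote add agd5f https://gitlab.freedesktop.org/agd5f/linux.git
    git fetch agd5f amd-staging-drm-next
    git checkout -b amd-staging-drm-next agd5f/amd-staging-drm-next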

Understanding the issue: Do you really need to change this? Where should you start looking for changes? 3) Is the issue in the AMD kernel driver or in the userspace?: Identifying the source of the issue is essential regardless of the GPU vendor. Sometimes this can be challenging, so here are some helpful tips:
  • Record the screen: Capture the screen using a recording app while experiencing the issue. If the bug appears in the capture, it's likely a userspace issue, not the kernel display driver.
  • Analyze the dmesg log: Look for error messages related to the display driver in the dmesg log. If the error message appears before the message [drm] Display Core v..., it's not likely a display driver issue. If this message doesn't appear in your log, the display driver wasn't fully loaded and you will see a notification that something went wrong here. (See the quick dmesg sketch below.)
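A quick dmesg sketch for the check above (the filters are only examples):
    sudo dmesg | grep -i '\[drm\]'          # all DRM-related messages
    sudo dmesg | grep 'Display Core'        # present only if the AMD display driver (DC) loaded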
4) AMD Display Manager vs. AMD Display Core: The AMD display driver consists of two components:
  • Display Manager (DM): This component interacts directly with the Linux DRM infrastructure. Occasionally, issues can arise from misinterpretations of DRM properties or features. If the issue doesn't occur on other platforms with the same AMD hardware - for example, only happens on Linux but not on Windows - it's more likely related to the AMD DM code.
  • Display Core (DC): This is the platform-agnostic part responsible for setting and programming hardware features. Modifications to the DC usually require validation on other platforms, like Windows, to avoid regressions.
5) Identify the DC HW family: Each AMD GPU has variations in its hardware architecture. Features and helpers differ between families, so determining the relevant code for your specific hardware is crucial.
  • Find GPU product information in Linux/AMD GPU documentation
  • Check the dmesg log for the Display Core version (since this commit in Linux kernel v6.3). For example:
    • [drm] Display Core v3.2.241 initialized on DCN 2.1
    • [drm] Display Core v3.2.237 initialized on DCN 3.0.1

Investigating the relevant driver code: Keep unrelated driver code from affecting your investigation. 6) Narrow the code inspection down to one DC HW family: the relevant code resides in a directory named after the DC number. For example, the DCN 3.0.1 driver code is located at drivers/gpu/drm/amd/display/dc/dcn301. AMD's shared code is huge, and you can use these boundaries to rule out code unrelated to your issue. 7) Newer families may inherit code from older ones: you can find dcn301 using code from dcn30, dcn20, dcn10 files. It's crucial to verify which hooks and helpers your driver utilizes to investigate the right portion. You can leverage ftrace for supplemental validation (see the sketch after the capability listing below). To give an example, it was useful when I was updating DCN3 color mapping to correctly use their new post-blending color capabilities, such as: Additionally, you can use two different HW families to compare behaviours. If you see the issue in one but not in the other, you can compare the code and understand what has changed and whether the implementation from a previous family doesn't fit the new HW resources or design well. You can also count on the help of the community on the Linux AMD issue tracker to validate your code on other hardware and/or systems. This approach helped me debug a 2-year-old issue where the cursor gamma adjustment was incorrect in DCN3 hardware, but working correctly for the DCN2 family. I solved the issue in two steps, thanks to community feedback and validation: 8) Check the hardware capability screening in the driver: You can currently find a list of display hardware capabilities in the drivers/gpu/drm/amd/display/dc/dcn*/dcn*_resource.c file. More precisely in the dcn*_resource_construct() function. Using DCN301 for illustration, here is the list of its hardware caps:
	/*************************************************
	 *  Resource + asic cap harcoding                *
	 *************************************************/
	pool->base.underlay_pipe_index = NO_UNDERLAY_PIPE;
	pool->base.pipe_count = pool->base.res_cap->num_timing_generator;
	pool->base.mpcc_count = pool->base.res_cap->num_timing_generator;
	dc->caps.max_downscale_ratio = 600;
	dc->caps.i2c_speed_in_khz = 100;
	dc->caps.i2c_speed_in_khz_hdcp = 5; /*1.4 w/a enabled by default*/
	dc->caps.max_cursor_size = 256;
	dc->caps.min_horizontal_blanking_period = 80;
	dc->caps.dmdata_alloc_size = 2048;
	dc->caps.max_slave_planes = 2;
	dc->caps.max_slave_yuv_planes = 2;
	dc->caps.max_slave_rgb_planes = 2;
	dc->caps.is_apu = true;
	dc->caps.post_blend_color_processing = true;
	dc->caps.force_dp_tps4_for_cp2520 = true;
	dc->caps.extended_aux_timeout_support = true;
	dc->caps.dmcub_support = true;
	/* Color pipeline capabilities */
	dc->caps.color.dpp.dcn_arch = 1;
	dc->caps.color.dpp.input_lut_shared = 0;
	dc->caps.color.dpp.icsc = 1;
	dc->caps.color.dpp.dgam_ram = 0; // must use gamma_corr
	dc->caps.color.dpp.dgam_rom_caps.srgb = 1;
	dc->caps.color.dpp.dgam_rom_caps.bt2020 = 1;
	dc->caps.color.dpp.dgam_rom_caps.gamma2_2 = 1;
	dc->caps.color.dpp.dgam_rom_caps.pq = 1;
	dc->caps.color.dpp.dgam_rom_caps.hlg = 1;
	dc->caps.color.dpp.post_csc = 1;
	dc->caps.color.dpp.gamma_corr = 1;
	dc->caps.color.dpp.dgam_rom_for_yuv = 0;
	dc->caps.color.dpp.hw_3d_lut = 1;
	dc->caps.color.dpp.ogam_ram = 1;
	// no OGAM ROM on DCN301
	dc->caps.color.dpp.ogam_rom_caps.srgb = 0;
	dc->caps.color.dpp.ogam_rom_caps.bt2020 = 0;
	dc->caps.color.dpp.ogam_rom_caps.gamma2_2 = 0;
	dc->caps.color.dpp.ogam_rom_caps.pq = 0;
	dc->caps.color.dpp.ogam_rom_caps.hlg = 0;
	dc->caps.color.dpp.ocsc = 0;
	dc->caps.color.mpc.gamut_remap = 1;
	dc->caps.color.mpc.num_3dluts = pool->base.res_cap->num_mpc_3dlut; //2
	dc->caps.color.mpc.ogam_ram = 1;
	dc->caps.color.mpc.ogam_rom_caps.srgb = 0;
	dc->caps.color.mpc.ogam_rom_caps.bt2020 = 0;
	dc->caps.color.mpc.ogam_rom_caps.gamma2_2 = 0;
	dc->caps.color.mpc.ogam_rom_caps.pq = 0;
	dc->caps.color.mpc.ogam_rom_caps.hlg = 0;
	dc->caps.color.mpc.ocsc = 1;
	dc->caps.dp_hdmi21_pcon_support = true;
	/* read VBIOS LTTPR caps */
	if (ctx->dc_bios->funcs->get_lttpr_caps) {
		enum bp_result bp_query_result;
		uint8_t is_vbios_lttpr_enable = 0;

		bp_query_result = ctx->dc_bios->funcs->get_lttpr_caps(ctx->dc_bios, &is_vbios_lttpr_enable);
		dc->caps.vbios_lttpr_enable = (bp_query_result == BP_RESULT_OK) && !!is_vbios_lttpr_enable;
	}

	if (ctx->dc_bios->funcs->get_lttpr_interop) {
		enum bp_result bp_query_result;
		uint8_t is_vbios_interop_enabled = 0;

		bp_query_result = ctx->dc_bios->funcs->get_lttpr_interop(ctx->dc_bios, &is_vbios_interop_enabled);
		dc->caps.vbios_lttpr_aware = (bp_query_result == BP_RESULT_OK) && !!is_vbios_interop_enabled;
	}
Keep in mind that the documentation of color capabilities is available in the Linux kernel documentation.
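For tip 7 above, a minimal ftrace sketch to confirm which family-specific hooks actually run on your hardware (this assumes tracefs is mounted in the usual place and the amdgpu driver is loaded; the dcn30_* pattern is just an example):
    cd /sys/kernel/tracing
    echo function > current_tracer
    echo 'dcn30_*' > set_ftrace_filter      # restrict tracing to the suspected family's functions
    echo 1 > tracing_on
    # ... reproduce the issue (e.g. change the gamma LUT, move the cursor) ...
    echo 0 > tracing_on
    head -n 40 trace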

Understanding the development history: What has brought us to the current state? 9) Pinpoint relevant commits: Use git log and git blame to identify commits targeting the code section you're interested in. 10) Track regressions: If you're examining the amd-staging-drm-next branch, check for regressions between DC release versions. These are defined by DC_VER in the drivers/gpu/drm/amd/display/dc/dc.h file. Alternatively, find a commit with the format drm/amd/display: 3.2.221 that determines a display release. It's useful for bisecting. This information helps you understand how outdated your branch is and identify potential regressions. You can assume each DC_VER bump takes around one week. Finally, check the testing log of each release in the report provided on the amd-gfx mailing list, such as this one: Tested-by: Daniel Wheeler.
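A rough sketch of these commands (the file and the symbol used with git blame are only examples):
    # commits touching the family-specific file under investigation
    git log --oneline -- drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c
    # last change to a specific capability assignment (example symbol)
    git blame -L '/post_blend_color_processing/,+1' drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c
    # display release bumps, handy as bisect boundaries
    git log --oneline --grep='drm/amd/display: 3\.2\.'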

Reducing the inspection area: Focus on what really matters. 11) Identify involved HW blocks: This helps isolate the issue. You can find more information about DCN HW blocks in the DCN Overview documentation. In summary:
  • Plane issues are closer to HUBP and DPP.
  • Blending/Stream issues are closer to MPC, OPP and OPTC. They are related to DRM CRTC subjects.
This information was useful when debugging a hardware rotation issue where the cursor plane got clipped off in the middle of the screen. Finally, the issue was addressed by two patches: 12) Issues around bandwidth (glitches) and clocks: May be affected by calculations done in these HW blocks and HW-specific values. The recalculation equations are found in the DML folder. DML stands for Display Mode Library. It's in charge of all required configuration parameters supported by the hardware for multiple scenarios. See more in the AMD DC Overview kernel docs. It's a math library that optimally configures hardware to find the best balance between power efficiency and performance in a given scenario. Finding some clk variables that affect device behavior may be a sign of it. It's hard for an external developer to debug this part, since it involves information from HW specs and firmware programming that we don't have access to. The best option is to provide all relevant debugging information you have and ask AMD developers to check the values you suspect.
  • Do a trick: If you suspect the power setup is degrading performance, try setting the amount of power supplied to the GPU to the maximum and see if it affects the system behavior with this command: sudo bash -c "echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level"
I learned it when debugging glitches with hardware cursor rotation on Steam Deck. My first attempt was changing the clock calculation. In the end, Rodrigo Siqueira proposed the right solution targeting bandwidth in two steps:

Checking implicit programming and hardware limitations: Bring implicit programming to the level of consciousness and recognize hardware limitations. 13) Implicit update types: Check if the selected type for atomic update may affect your issue. The update type depends on the mode settings, since programming some modes demands more time for hardware processing. More details in the source code:
/* Surface update type is used by dc_update_surfaces_and_stream
 * The update type is determined at the very beginning of the function based
 * on parameters passed in and decides how much programming (or updating) is
 * going to be done during the call.
 *
 * UPDATE_TYPE_FAST is used for really fast updates that do not require much
 * logical calculations or hardware register programming. This update MUST be
 * ISR safe on windows. Currently fast update will only be used to flip surface
 * address.
 *
 * UPDATE_TYPE_MED is used for slower updates which require significant hw
 * re-programming however do not affect bandwidth consumption or clock
 * requirements. At present, this is the level at which front end updates
 * that do not require us to run bw_calcs happen. These are in/out transfer func
 * updates, viewport offset changes, recout size changes and pixel
 * depth changes.
 * This update can be done at ISR, but we want to minimize how often this
 * happens.
 *
 * UPDATE_TYPE_FULL is slow. Really slow. This requires us to recalculate our
 * bandwidth and clocks, possibly rearrange some pipes and reprogram anything
 * front end related. Any time viewport dimensions, recout dimensions,
 * scaling ratios or gamma need to be adjusted or pipe needs to be turned on
 * (or disconnected) we do a full update. This cannot be done at ISR level
 * and should be a rare event. Unless someone is stress testing mpo
 * enter/exit, playing with colour or adjusting underscan we don't expect to
 * see this call at all.
 */
enum surface_update_type {
	UPDATE_TYPE_FAST, /* super fast, safe to execute in isr */
	UPDATE_TYPE_MED,  /* ISR safe, most of programming needed, no bw/clk change*/
	UPDATE_TYPE_FULL, /* may need to shuffle resources */
};

Using tools: Observe the current state, validate your findings, continue improvements. 14) Use AMD tools to check hardware state and driver programming: they help you understand your driver settings and check the behavior when changing those settings.
  • DC Visual confirmation: Check multiple planes and pipe split policy.
  • DTN logs: Check display hardware state, including rotation, size, format, underflow, blocks in use, color block values, etc.
  • UMR: Check ASIC info, register values, KMS state - links and elements (framebuffers, planes, CRTCs, connectors). Source: UMR project documentation
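A rough sketch of how the first two are usually reached through debugfs (the dri/0 index is an assumption; pick the entry matching your card):
    sudo bash -c "echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_visual_confirm"   # enable DC visual confirmation
    sudo cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log                         # dump the current display HW state (DTN log)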
15) Use generic DRM/KMS tools:
  • IGT test tools: Use generic KMS tests or develop your own to isolate the issue in the kernel space. Compare results across different GPU vendors to understand their implementations and find potential solutions. AMD also has specific IGT tests for its GPUs that are expected to work without failures on any AMD GPU. You can check results of HW-specific tests using different display hardware families, or you can compare expected differences between the generic workflow and the AMD workflow.
  • drm_info: This tool summarizes the current state of a display driver (capabilities, properties and formats) per element of the DRM/KMS workflow. Output can be helpful when reporting bugs.
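Hedged examples of both (binary paths and available subtests depend on your IGT build and version):
    sudo ./build/tests/kms_plane --list-subtests    # list the subtests of a generic KMS test, then run the ones you need
    sudo ./build/tests/kms_plane                    # or run all of its subtests
    drm_info                                        # dump capabilities, properties and formats per KMS object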

Don't give up! Debugging issues in the AMD display driver can be challenging, but by following these tips and leveraging available resources, you can significantly improve your chances of success. Worth mentioning: This blog post builds upon my talk "I'm not an AMD expert, but", presented at the 2022 XDC. It shares guidelines that helped me debug AMD display issues as an external developer of the driver. Open Source Display Driver: The Linux kernel/AMD display driver is open source, allowing you to actively contribute by addressing issues listed in the official tracker. Tackling existing issues or resolving your own can be a rewarding learning experience and a valuable contribution to the community. Additionally, the tracker serves as a valuable resource for finding similar bugs, troubleshooting tips, and suggestions from AMD developers. Finally, it's a platform for seeking help when needed. Remember, contributing to the open source community through issue resolution and collaboration is mutually beneficial for everyone involved.

4 December 2023

Russ Allbery: Cumulative haul

I haven't done one of these in quite a while, long enough that I've already read and reviewed many of these books. John Joseph Adams (ed.) The Far Reaches (sff anthology)
Poul Anderson The Shield of Time (sff)
Catherine Asaro The Phoenix Code (sff)
Catherine Asaro The Veiled Web (sff)
Travis Baldree Bookshops & Bonedust (sff)
Sue Burke Semiosis (sff)
Jacqueline Carey Cassiel's Servant (sff)
Rob Copeland The Fund (nonfiction)
Mar Delaney Wolf Country (sff)
J.S. Dewes The Last Watch (sff)
J.S. Dewes The Exiled Fleet (sff)
Mike Duncan Hero of Two Worlds (nonfiction)
Mike Duncan The Storm Before the Storm (nonfiction)
Kate Elliott King's Dragon (sff)
Zeke Faux Number Go Up (nonfiction)
Nicola Griffith Menewood (sff)
S.L. Huang The Water Outlaws (sff)
Alaya Dawn Johnson The Library of Broken Worlds (sff)
T. Kingfisher Thornhedge (sff)
Naomi Kritzer Liberty's Daughter (sff)
Ann Leckie Translation State (sff)
Michael Lewis Going Infinite (nonfiction)
Jenna Moran Magical Bears in the Context of Contemporary Political Theory (sff collection)
Ari North Love and Gravity (graphic novel)
Ciel Pierlot Bluebird (sff)
Terry Pratchett A Hat Full of Sky (sff)
Terry Pratchett Going Postal (sff)
Terry Pratchett Thud! (sff)
Terry Pratchett Wintersmith (sff)
Terry Pratchett Making Money (sff)
Terry Pratchett Unseen Academicals (sff)
Terry Pratchett I Shall Wear Midnight (sff)
Terry Pratchett Snuff (sff)
Terry Pratchett Raising Steam (sff)
Terry Pratchett The Shepherd's Crown (sff)
Aaron A. Reed 50 Years of Text Games (nonfiction)
Dashka Slater Accountable (nonfiction)
Rory Stewart The Marches (nonfiction)
Emily Tesh Silver in the Wood (sff)
Emily Tesh Drowned Country (sff)
Valerie Valdes Chilling Effect (sff)
Martha Wells System Collapse (sff)
Martha Wells Witch King (sff)

23 November 2023

Bits from Debian: archive.debian.org rsync address change

The proposed and previously announced changes to the rsync service have become effective, with the rsync://archive.debian.org address now being discontinued. The worldwide Debian mirrors network has served archive.debian.org via both HTTP and rsync. As part of improving the reliability of the service for users, the Debian mirrors team is separating the access methods to different host names: rsync service on archive.debian.org has stopped, and we encourage anyone using the service to migrate to the new host name as soon as possible. If you are currently using rsync to the debian-archive from a debian.org server that forms part of the archive.debian.org rotation, we also encourage administrators to move to the new service name. This will allow us to better manage which back-end servers offer rsync service in future. Note that due to its nature the content of archive.debian.org does not change frequently - generally there will be several months, possibly more than a year, between updates - so checking for updates more than once a day is unnecessary. For additional information please reach out to the Debian Mirrors Team mailing list.

10 October 2023

Matthias Klumpp: How to indicate device compatibility for your app in MetaInfo data

At the moment I am hard at work putting together the final bits for the AppStream 1.0 release (hopefully to be released this month). The new release comes with many new features, an improved developer API and removal of most deprecated things (so it carefully breaks compatibility with very old data and the previous C API). One of the tasks for the upcoming 1.0 release was #481, asking about a formal way to distinguish Linux phone applications from desktop applications. AppStream infamously does not support any is-for-phone label for software components; instead, the decision whether something is compatible with a device is based on the device's capabilities and the component's requirements. This allows truly adaptive applications to describe their requirements correctly, and does not lock us into form factors going into the future, as there are many and the feature range between a phone, a tablet and a tiny laptop is quite fluid. Of course the match to current device capabilities check does not work if you are a website ranking phone compatibility. It also does not really work if you are a developer and want to know which devices your component / application will actually be considered compatible with. One goal for AppStream 1.0 is to have its library provide more complete building blocks to software centers. Instead of just a "here's the data, interpret it according to the specification" API, libappstream now interprets the specification for the application and provides API to handle most common operations like checking device compatibility. For developers, AppStream also now implements a few virtual chassis configurations, to roughly gauge which configurations a component may be compatible with. To test the new code, I ran it against the large Debian and Flatpak repositories to check which applications are considered compatible with what chassis/device type already. The result was fairly disastrous, with many applications not specifying compatibility correctly (many do, but it's by far not the norm!). Which brings me to the actual topic of this blog post: Very few seem to really know how to mark an application compatible with certain screen sizes and inputs! This is most certainly a matter of incomplete guides and good templates, so maybe this post can help with that a bit:

The ultimate cheat-sheet to mark your app chassis-type compatible As a quick reminder, compatibility is indicated using AppStream's relations system: A requires relation indicates that the system will not run at all or will run terribly if the requirement is not met. If the requirement is not met, it should not be installable on a system. A recommends relation means that it would be advantageous to have the recommended items, but it's not essential to run the application (it may run with a degraded experience without the recommended things though). And a supports relation means a given interface/device/control/etc. is supported by this application, but the application may work completely fine without it.

I have a desktop-only application A desktop-only application is characterized by needing a larger screen to fit the application, and requiring a physical keyboard and accurate mouse input. This type is assumed by default if no capabilities are set for an application, but it's better to be explicit. This is the metadata you need:
<component type="desktop-application">
  <id>org.example.desktopapp</id>
  <name>DesktopApp</name>
  [...]
  <requires>
    <display_length>768</display_length>
    <control>keyboard</control>
    <control>pointing</control>
  </requires>
  [...]
</component>
With this requires relation, you require a small-desktop sized screen (at least 768 device-independent pixels (dp) on its smallest edge) and require a keyboard and mouse to be present / connectable. Of course, if your application needs a larger minimum size, adjust the requirement accordingly. Note that if the requirement is not met, your application may not be offered for installation.
Note: Device-independent / logical pixels One logical pixel (= device independent pixel) roughly corresponds to the visual angle of one pixel on a device with a pixel density of 96 dpi (for historical X11 reasons) and a distance from the observer of about 52 cm, making the physical pixel about 0.26 mm in size. When using logical pixels as unit, they might not always map to exact physical lengths as their exact size is defined by the device providing the display. They do however accurately depict the maximum amount of pixels that can be drawn in the depicted direction on the device's display space. AppStream always uses logical pixels when measuring lengths in pixels.
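As a rough worked example (my numbers, not taken from the specification): a 1920×1080 phone panel rendered at a 3× scale factor exposes 640×360 logical pixels, so its shortest edge of 360 dp satisfies a display_length requirement of 360, but not the desktop requirement of 768 used above.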

I have an application that works on mobile and on desktop / an adaptive app Adaptive applications have fewer hard requirements, but a wide range of support for controls and screen sizes. For example, they support touch input, unlike desktop apps. An example MetaInfo snippet for this kind of app may look like this:
<component type="desktop-application">
  <id>org.example.adaptive_app</id>
  <name>AdaptiveApp</name>
  [...]
  <requires>
    <display_length>360</display_length>
  </requires>
  <supports>
    <control>keyboard</control>
    <control>pointing</control>
    <control>touch</control>
  </supports>
  [...]
</component>
Unlike the pure desktop application, this adaptive application requires a much smaller lowest display edge length, and also supports touch input, in addition to keyboard and mouse/touchpad precision input.

I have a pure phone/tablet app Making an application a pure phone application is tricky: We need to mark it as compatible with phones only, while not completely preventing its installation on non-phone devices (even though its UI is horrible, you may want to test the app, and software centers may allow its installation when requested explicitly even if they don't show it by default). This is how to achieve that result:
<component type="desktop-application">
  <id>org.example.phoneapp</id>
  <name>PhoneApp</name>
  [...]
  <requires>
    <display_length>360</display_length>
  </requires>
  <recommends>
    <display_length compare="lt">1280</display_length>
    <control>touch</control>
  </recommends>
  [...]
</component>
We require a phone-sized display minimum edge size (adjust to a value that fits your app!), but then also recommend the screen to have a smaller edge size than a larger tablet/laptop, while also recommending touch input and not listing any support for keyboard and mouse. Please note that this blog post is of course not a comprehensive guide, so if you want to dive deeper into what you can do with requires/recommends/suggests/supports, you may want to have a look at the relations tags described in the AppStream specification.
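Independently of the chassis-type question, it is also worth running the regular validator over the file; the file name below is just the example from above:
    appstreamcli validate ./org.example.phoneapp.metainfo.xml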

Validation It is still easy to make mistakes with the system requirements metadata, which is why AppStream 1.0 will provide more commands to check MetaInfo files for system compatibility. Current pre-1.0 AppStream versions already have an is-satisfied command to check if the application is compatible with the currently running operating system:
:~$ appstreamcli is-satisfied ./org.example.adaptive_app.metainfo.xml
Relation check for: */*/*/org.example.adaptive_app/*
Requirements:
   Unable to check display size: Can not read information without GUI toolkit access.
Recommendations:
   No recommended items are set for this software.
Supported:
   Physical keyboard found.
   Pointing device (e.g. a mouse or touchpad) found.
   This software supports touch input.
In addition to this command, AppStream 1.0 will introduce a new one as well: check-syscompat. This command will check the component against libappstream's mock system configurations that define a most common (whatever that is at the time) configuration for a respective chassis type. If you pass the --details flag, you can even get an explanation why the component was considered or not considered for a specific chassis type:
:~$ appstreamcli check-syscompat --details ./org.example.phoneapp.metainfo.xml
Chassis compatibility check for: */*/*/org.example.phoneapp/*
Desktop:
   Incompatible
   recommends: This software recommends a display with its shortest edge
   being << 1280 px in size, but the display of this device has 1280 px.
   recommends: This software recommends a touch input device.
Laptop:
   Incompatible
   recommends: This software recommends a display with its shortest edge 
   being << 1280 px in size, but the display of this device has 1280 px.
   recommends: This software recommends a touch input device.
Server:
   Incompatible
   requires: This software needs a display for graphical content.
   recommends: This software needs a display for graphical content.
   recommends: This software recommends a touch input device.
Tablet:
   Compatible (100%)
Handset:
   Compatible (100%)
I hope this is helpful for people. Happy metadata writing!

6 October 2023

Russ Allbery: Review: The Far Reaches

Review: The Far Reaches, edited by John Joseph Adams
Publisher: Amazon Original Stories
Copyright: June 2023
ISBN: 1-6625-1572-3
ISBN: 1-6625-1622-3
ISBN: 1-6625-1503-0
ISBN: 1-6625-1567-7
ISBN: 1-6625-1678-9
ISBN: 1-6625-1533-2
Format: Kindle
Pages: 219
Amazon has been releasing anthologies of original short SFF with various guest editors, free for Amazon Prime members. I previously tried Black Stars (edited by Nisi Shawl and Latoya Peterson) and Forward (edited by Blake Crouch). Neither were that good, but the second was much worse than the first. Amazon recently released a new collection, this time edited by long-standing SFF anthology editor John Joseph Adams and featuring a new story by Ann Leckie, which sounded promising enough to give them another chance. The definition of insanity is doing the same thing over and over again and expecting different results. As with the previous anthologies, each story is available separately for purchase or Amazon Prime "borrowing" with separate ISBNs. The sidebar cover is for the first in the sequence. Unlike the previous collections, which were longer novelettes or novellas, my guess is all of these are in the novelette range. (I did not do a word count.) If you're considering this anthology, read the Okorafor story ("Just Out of Jupiter's Reach"), consider "How It Unfolds" by James S.A. Corey, and avoid the rest. "How It Unfolds" by James S.A. Corey: Humans have invented a new form of physics called "slow light," which can duplicate any object that is scanned. The energy expense is extremely high, so the result is not a post-scarcity paradise. What the technology does offer, however, is a possible route to interstellar colonization: duplicate a team of volunteers and a ship full of bootstrapping equipment, and send copies to a bunch of promising-looking exoplanets. One of them might succeed. The premise is interesting. The twists Corey adds on top are even better. What can be duplicated once can be duplicated again, perhaps with more information. This is a lovely science fiction idea story that unfortunately bogs down because the authors couldn't think of anywhere better to go with it than relationship drama. I found the focus annoying, but the ideas are still very neat. (7) "Void" by Veronica Roth: A maintenance worker on a slower-than-light passenger ship making the run between Sol and Centauri unexpectedly is called to handle a dead body. A passenger has been murdered, two days outside the Sol system. Ace is in no way qualified to investigate the murder, nor is it her job, but she's watched a lot of crime dramas and she has met the victim before. The temptation to start poking around is impossible to resist. It's been a long time since I've read a story built around the differing experiences of time for people who stay on planets and people who spend most of their time traveling at relativistic speeds. It's a bit of a retro idea from an earlier era of science fiction, but it's still a good story hook for a murder mystery. None of the characters are that memorable and Roth never got me fully invested in the story, but this was still a pleasant way to pass the time. (6) "Falling Bodies" by Rebecca Roanhorse: Ira is the adopted son of a Genteel senator. He was a social experiment in civilizing the humans: rescue a human orphan and give him the best of Genteel society to see if he could behave himself appropriately. The answer was no, which is how Ira finds himself on Long Reach Station with a parole officer and a schooling opportunity, hopefully far enough from his previous mistakes for a second chance. Everyone else seems to like Rebecca Roanhorse's writing better than I do, and this is no exception. 
Beneath the veneer of a coming-of-age story with a twist of political intrigue, this is brutal, depressing, and awful, with an ending that needs a lot of content warnings. I'm sorry that I read it. (3) "The Long Game" by Ann Leckie: The Imperial Radch trilogy are some of my favorite science fiction novels of all time, but I am finding Leckie's other work a bit hit and miss. I have yet to read a novel of hers that I didn't like, but the short fiction I've read leans more heavily into exploring weird and alien perspectives, which is not my favorite part of her work. This story is firmly in that category: the first-person protagonist is a small tentacled alien creature, a bit like a swamp-dwelling octopus. I think I see what Leckie is doing here: balancing cynicism and optimism, exploring how lifespans influence thinking and planning, and making some subtle points about colonialism. But as a reading experience, I didn't enjoy it. I never liked any of the characters, and the conclusion of the story is the unsettling sort of main-character optimism that seems rather less optimistic to the reader. (4) "Just Out of Jupiter's Reach" by Nnedi Okorafor: K rm n scientists have found a way to grow living ships that can achieve a symbiosis with a human pilot, but the requirements for that symbiosis are very strict and hard to predict. The result was a planet-wide search using genetic testing to find the rare and possibly nonexistent matches. They found seven people. The deal was simple: spend ten years in space, alone, in her ship. No contact with any other human except at the midpoint, when the seven ships were allowed to meet up for a week. Two million euros a year, for as long as she followed the rules, and the opportunity to be part of a great experiment, providing data that will hopefully lead to humans becoming a spacefaring species. The core of this story is told during the seven days in the middle of the mission, and thus centers on people unfamiliar with human contact trying to navigate social relationships after five years in symbiotic ships that reshape themselves to their whims and personalities. The ships themselves link so that the others can tour, which offers both a good opportunity for interesting description and a concretized metaphor about meeting other people. I adore symbiotic spaceships, so this story had me at the premise. The surface plot is very psychological, and I didn't entirely click with it, but the sense of wonder vibes beneath that surface were wonderful. It also feels fresh and new: I've seen most of the ideas before, but not presented or written this way, or approached from quite this angle. Definitely the best story of the anthology. (8) "Slow Time Between the Stars" by John Scalzi: This, on the other hand, was a complete waste of time, redeemed only by being the shortest "story" in the collection. "Story" is generous, since there's only one character and a very dry, linear plot that exists only to make a philosophical point. "Speculative essay" may be closer. The protagonist is the artificial intelligence responsible for Earth's greatest interstellar probe. It is packed with a repository of all of human knowledge and the raw material to create life. Its mission is to find an exoplanet capable of sustaining that life, and then recreate it and support it. The plot, such as it is, follows the AI's decision to abandon that mission and cut off contact with Earth, for reasons that it eventually explains. Every possible beat of this story hit me wrong. 
The sense of wonder attaches to the most prosaic things and skips over the moments that could have provoked real wonder. The AI is both unbelievable and irritating, with all of the smug self-confidence of an Internet reply guy. The prose is overwrought in all the wrong places ("the finger of God, offering the spark to animate the dirt of another world" would totally be this AI's profile quote under their forum avatar). The only thing I liked about the story is the ethical point that it slowly meanders into, which I think I might agree with and at least find plausible. But it's delivered by the sort of character I would actively leave rooms to avoid, in a style that's about as engrossing as a tax form. Avoid. (2) Rating: 5 out of 10

30 September 2023

Ian Jackson: DKIM: rotate and publish your keys

If you are an email system administrator, you are probably using DKIM to sign your outgoing emails. You should be rotating the key regularly and automatically, and publishing old private keys. I have just released dkim-rotate 1.0; dkim-rotate is a tool to do this key rotation and publication. If you are an email user, your email provider ought to be doing this. If this is not done, your emails are non-repudiable, meaning that if they are leaked, anyone (eg, journalists, haters) can verify that they are authentic, and prove that to others. This is not desirable (for you). Non-repudiation of emails is undesirable This problem was described at some length in Matthew Green's article Ok Google: please publish your DKIM secret keys. Avoiding non-repudiation sounds a bit like lying. After all, I'm advising creating a situation where some people can't verify that something is true, even though it is. So I'm advocating casting doubt. Crucially, though, it's doubt about facts that ought to be private. When you send an email, that's between you and the recipient. Normally you don't intend for anyone, anywhere, who happens to get a copy, to be able to verify that it was really you that sent it. In practical terms, this verifiability has already been used by journalists to verify stolen emails. Associated Press provide a verification tool. Advice for all email users As a user, you probably don't want your emails to be non-repudiable. (Other people might want to be able to prove you sent some email, but your email system ought to serve your interests, not theirs.) So, your email provider ought to be rotating their DKIM keys, and publishing their old ones. At a rough guess, your provider probably isn't :-(. How to tell by looking at email headers A quick and dirty way to guess is to have a friend look at the email headers of a message you sent. (It is important that the friend uses a different email provider, since often DKIM signatures are not applied within a single email system.) If your friend sees a DKIM-Signature header then the message is DKIM signed. If they don't, then it wasn't. Most email traversing the public internet is DKIM signed nowadays; so if they don't see the header, probably they're not looking using the right tools, or they're actually on the same email system as you. In messages signed by a system running dkim-rotate, there will also be a header about the key rotation, to notify potential verifiers of the situation. Other systems that avoid non-repudiation-through-DKIM might do something similar. dkim-rotate's header looks like this:
DKIM-Signature-Warning: NOTE REGARDING DKIM KEY COMPROMISE
 https://www.chiark.greenend.org.uk/dkim-rotate/README.txt
 https://www.chiark.greenend.org.uk/dkim-rotate/ae/aeb689c2066c5b3fee673355309fe1c7.pem
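If you want to see what a verifier would fetch, you can also query the selector's DKIM DNS record directly. The selector name below is hypothetical; take the real one from the s= tag of a DKIM-Signature header:
    dig +short TXT 2023a._domainkey.example.org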
But an email system might do half of the job of dkim-rotate: regularly rotating the key would cause the signatures of old emails to fail to verify, which is a good start. In that case there probably won t be such a header. Testing verification of new and old messages You can also try verifying the signatures. This isn t entirely straightforward, especially if you don t have access to low-level mail tooling. Your friend will need to be able to save emails as raw whole headers and body, un-decoded, un-rendered. If your friend is using a traditional Unix mail program, they should save the message as an mbox file. Otherwise, ProPublica have instructions for attaching and transferring and obtaining the raw email. (Scroll down to How to Check DKIM and ARC .) Checking that recent emails are verifiable Firstly, have your friend test that they can in fact verify a DKIM signature. This will demonstrate that the next test, where the verification is supposed to fail, is working properly and fails for the right reasons. Send your friend a test email now, and have them do this on a Linux system:
    # save the message as test-email.mbox
    apt install libmail-dkim-perl # or equivalent on another distro
    dkimproxy-verify <test-email.mbox
You should see output containing something like this:
    originator address: ijackson@chiark.greenend.org.uk
    signature identity: @chiark.greenend.org.uk
    verify result: pass
    ...
If the output contains verify result: fail (body has been altered) then probably your friend didn't manage to faithfully save the unaltered raw message. Checking old emails cannot be verified When you both have that working, have your friend find an older email of yours, from (say) a month ago. Perform the same steps. Hopefully they will see something like this:
    originator address: ijackson@chiark.greenend.org.uk
    signature identity: @chiark.greenend.org.uk
    verify result: fail (bad RSA signature)
or maybe
    verify result: invalid (public key: not available)
This indicates that this old email can no longer be verified. That's good: it means that anyone who steals a copy can't verify it either. If it's leaked, the journalist who receives it won't know it's genuine and unmodified; they should then be suspicious. If your friend sees verify result: pass, then they have verified that that old email of yours is genuine. Anyone who had a copy of the mail can do that. This is good for email thieves, but not for you. For email admins: announcing dkim-rotate 1.0 I have been running dkim-rotate 0.4 on my infrastructure since last August, and I had entirely forgotten about it: it has run flawlessly for a year. I was reminded of the topic by seeing DKIM in other blog posts. Obviously, it is time to decree that dkim-rotate is 1.0. If you're a mail system administrator, your users are best served if you use something like dkim-rotate. The package is available in Debian stable, and supports Exim out of the box, but other MTAs should be easy to support too, via some simple ad-hoc scripting. Limitation of this approach Even with this key rotation approach, emails remain non-repudiable for a short period after they're sent - typically, a few days. Someone who obtains a leaked email very promptly, and shows it to the journalist (for example) right away, can still convince the journalist. This is not great, but at least it doesn't apply to the vast bulk of your email archive. There are possible email protocol improvements which might help, but they're quite out of scope for this article.
Edited 2023-10-01 00:20 +01:00 to fix some grammar



27 September 2023

Jonathan McDowell: onak 0.6.3 released

Yesterday I tagged a new version of onak, my OpenPGP compatible keyserver. I'd spent a bit of time during DebConf doing some minor cleanups, in particular an annoying systemd socket activation issue I'd been seeing. That turned out to be due to completely failing to compile in the systemd support, even when it was detected. There was also a signature verification issue with certain Ed25519 signatures (thanks Antoine Beaupré for making me dig into that one), along with various code cleanups. I also worked on Stateless OpenPGP CLI support, which is something I talked about when I released 0.6.2. It isn't something that's suitable for release, but it is sufficient to allow running the OpenPGP interoperability test suite verification tests, which I'm pleased to say all now pass. For the next release I'm hoping the OpenPGP crypto refresh process will have completed, which at the very least will mean adding support for v6 packet types and fingerprints. The PostgreSQL DB backend could also use some love, and I might see if performance with SQLite3 has improved any. Anyway. Available locally or via GitHub.
0.6.3 - 26th September 2023
  • Fix systemd detection + socket activation
  • Add CMake checking for Berkeley DB
  • Minor improvements to keyd logging
  • Fix decoding of signature creation time
  • Relax version check on parsing signature + key packets
  • Improve HTML escaping
  • Handle failed database initialisation more gracefully
  • Fix bug with EDDSA signatures with top 8+ bits unset

21 September 2023

Jonathan Carter: DebConf23

I very, very nearly didn't make it to DebConf this year; I had a bad cold/flu for a few days before I left, and after a negative covid-19 test just minutes before my flight, I decided to take the plunge and travel. This is just everything in chronological order, more or less, it's the only way I could write it.

DebCamp I planned to spend DebCamp working on various issues. Very few of them actually got done. I spent the first few days in bed further recovering, took a covid-19 test when I arrived and again after I felt better, and both were negative, so not sure what exactly was wrong with me, but between that and catching up with other Debian duties, I couldn't make any progress on catching up on the packaging work I wanted to do. I'll still post what I intended here, and I'll try to take a few days to focus on these some time next month: Calamares / Debian Live stuff:
  • #980209 installation fails at the install boot loader phase
  • #1021156 calamares-settings-debian: Confusing/generic program names
  • #1037299 Install Debian -> Untrusted application launcher
  • #1037123 Minimal HD space required too small for some live images
  • #971003 Console auto-login doesn't work with sysvinit
At least Calamares has been trixiefied in testing, so there's that! Desktop stuff:
  • #1038660 please set a placeholder theme during development, different from any release
  • #1021816 breeze: Background image not shown any more
  • #956102 desktop-base: unwanted metadata within images
  • #605915 please make it a non-native package
  • #681025 Put old themes in a new package named desktop-base-extra
  • #941642 desktop-base: split theme data files and desktop integrations in separate packages
The Egg theme that I want to develop for testing/unstable is based on Juliette Taka's Homeworld theme that was used for Bullseye. Egg, as in, something that hasn't quite hatched yet. Get it? (for #1038660) Debian Social:
  • Set up Lemmy instance
    • I started setting up a Lemmy instance before DebCamp, and meant to finish it.
  • Migrate PeerTube to new server
    • We got a new physical server for our PeerTube instance, we should have more space for growth and it would help us fix the streaming feature on our platform.
Loopy: I intended to get the loop for DebConf in good shape before I left, so that we could spend some time during DebCamp making some really nice content. Unfortunately this went very tumbly, but at least we ended up with a loopy that kind of worked and wasn't too horrible. There's always another DebConf to try again, right?
So DebCamp as a usual DebCamp was pretty much a wash (fitting with all the rain we had?) for me, at least it gave me enough time to recover a bit for DebConf proper, and I had enough time left to catch up on some critical DPL duties and put together a few slides for the Bits from the DPL talk.

DebConf Bits From the DPL I had very, very little available time to prepare something for Bits from the DPL, but I managed to put some slides together (available on my wiki page). I mostly covered:
  • A very quick introduction of myself (I've done this so many times, it feels redundant giving my history every time), and some introduction on what it is that the DPL does. I declared my intent not to run for DPL again, and the reasoning behind it, and a few bits of information for people who may intend to stand for DPL next year.
  • The sentiment out there for the Debian 12 release (which has been very positive). How we include firmware by default now, and that we're saying goodbye to two architectures, GNU/KFreeBSD and mipsel.
  • Debian Day and the 30th birthday party celebrations from local groups all over the world (and a reminder about the Local Groups BoF later in the week).
  • I looked forward to Debian 13 (trixie!), and how we're gaining riscv64 as a release architecture, as well as loongarch64, and that plans seem to be forming to fix 2k38 in Debian, hopefully largely by the time the Trixie release comes by.
  • I made some comments about Enterprise Linux as people refer to the RHEL eco-system these days, how really bizarre some aspects of it are (like the kernel maintenance), and that some big vendors are choosing to support systems outside of that eco-system now (like CPanel now supporting Ubuntu too). I closed with the quote below from Ian Murdock, and assured the audience that if they want to go out and make money with Debian, they are more than welcome to.
Job Fair I walked through the hallway where the Job Fair was hosted, and enjoyed all the buzz. It's not always easy to get this right, but this year it was very active and energetic, I hope lots of people made some connections! Cheese & Wine Due to state laws and alcohol licenses, we couldn't consume alcohol from outside the state of Kerala in the common areas of the hotel (only in private rooms), so this wasn't quite as big or as fun as our usual C&W parties since we couldn't share as much from our individual countries and cultures, but we always knew that this was going to be the case for this DebConf, and it still ended up being alright. Day Trip I opted for the forest / waterfalls daytrip. It was really, really long with lots of time in the bus. I think our trip's organiser underestimated how long it would take between the points on the route (all in all it wasn't that far, but on a bus on a winding mountain road, it takes long). We left at 8:00 and only found our way back to the hotel around 23:30. Even though we arrived tired and hungry, we saw some beautiful scenery, animals and also met indigenous river people who talked about their struggles against being driven out of their place of living multiple times as government invests in new developments like dams and hydro power. Photos available in the DebConf23 public git repository. Losing a beloved Debian Developer during DebConf To our collective devastation, not everyone made it back from their day trips. Abraham Raji was out on the kayak day trip, and while swimming, got caught by a whirlpool from a drainage system. Even though all of us were properly exhausted and shocked in disbelief at this point, we had to stay up and make some tough decisions. Some initially felt that we had to cancel the rest of DebConf. We also had to figure out how to announce what happened asap both to the larger project and at DebConf in an official manner, while ensuring that due diligence took place and that the family was informed by the police first before making anything public. We ended up cancelling all the talks for the following day, with an address from the DPL in the morning to explain what had happened. Of all the things I've ever had to do as DPL, this was by far the hardest. The day after that, talks were also cancelled for the morning so that we could attend his funeral. Dozens of DebConf attendees headed out by bus to go pay their final respects, many wearing the t-shirts that Abraham had designed for DebConf. A book of condolences was set up so that everyone who wished to could write a message on how they remembered him. The book will be kept by his family.
Today marks a week since his funeral, and I still feel very raw about it. And even though there was uncertainty whether DebConf should even continue after his death, in hindsight I'm glad that everyone pushed forward. While we were all heartbroken, it was also heartwarming to see people care for each other in all of this. If anything, I think I needed more time at DebConf just to be in that warm aura of emotional support for just a bit longer. There are many people who I wanted to talk to who I barely even had a chance to see. Abraham, or Abru as he was called by some people (which I like because bru in Afrikaans is like bro in English, not sure if that's what it implied locally too) enjoyed artistic pursuits, but he was also passionate about knowledge transfer. He ran classes at DebConf both last year and this year (and I think at other local events too) where he taught people packaging via a quick course that he put together. His enthusiasm for Debian was contagious, a few of the people who he was mentoring came up to me and told me that they were going to see it through and become a DD in honor of him. I can't even remember how I reacted to that, my brain was already so worn out and stitching that together with the tragedy of what happened while at DebConf was just too much for me. I first met him in person last year in Kosovo, I already knew who he was, so I think we interacted during the online events the year before. He was just one of those people who showed so much promise, and I was curious to see what he'd achieve in the future. Unfortunately, he was taken away from us too soon. Poetry Evening Later in the week we had the poetry evening. This was the first time I had the courage to recite something. I read Ithaka by C.P. Cavafy (translated by Edmund Keely). The first time I heard about this poem was in an interview with Julian Assange's wife, where she mentioned that he really loves this poem, and it caught my attention because I really like the Weezer song Return to Ithaka and always wondered what it was about, so needless to say, that was another rabbit hole at some point. Group Photo Our DebConf photographer organised another group photo for this event, links to high-res versions available on Aigar's website.
BoFs I didn't attend nearly as many talks this DebConf as I would've liked (fortunately I can catch up on video, should be released soon), but I did make it to a few BoFs. In the Local Groups BoF, representatives from various local teams were present who introduced themselves and explained what they were doing. From memory (sorry if I left someone out), we had people from Belgium, Brazil, Taiwan and South Africa. We talked about types of events a local group could do (BSPs, Mini DC, sprints, Debian Day, etc.), how to help local groups get started, booth kits for conferences, and setting up some form of calendar that lists important Debian events in a way that makes it easier for people to plan and co-ordinate. There's a mailing list for co-ordination of local groups, and the IRC channel is #debian-localgroups on OFTC.
If you got one of these Cheese & Wine bags from DebConf, that's from the South African local group!
In the Debian.net BoF, we discussed the Debian.net hosting service, where Debian pays for VMs hosted for projects by individual DDs on Debian.net. The idea is that we start some form of census that monitors the services, whether they're still in use, whether the system is up to date, whether someone still cares for it, etc. We had some discussion about where the lines of responsibility are drawn, and we can probably make things a little bit more clear in the documentation. We also want to offer more in terms of backups and monitoring (currently DDs do get 500GB from rsync.net that could be used for backups of their services though). The intention is also to deploy some form of configuration management for some essentials across the hosts. We should also look at getting some sponsored hosting for this. In the Debian Social BoF, we discussed some services that need work / expansion. In particular, Matrix keeps growing at an increased rate as more users use it and more channels are bridged, so it will likely move to its own host with big disks soon. We might replace Pleroma with a fork called Akkoma, this will need some more homework and checking whether it's even feasible. Some services haven't really been used (like Writefreely and Plume), and it might be time to retire them. We might just have to help one or two users migrate some of their posts away if we do retire them. Mjolner seems to do a fine job at spam blocking, we haven't had any notable incidents yet. WordPress now has improved fediverse support, it's unclear whether it works on a multi-site instance yet, I'll test it at some point soon and report back. For upcoming services, we are implementing Lemmy and probably also Mobilizon. A request was made that we also look into Loomio. More Information Overload There's so much that happens at DebConf, it's tough to take it all in, and also to find time to write about all of it, but I'll mention a few more things that are certainly worthy of note. During DebConf, we had some people from the Kite Linux team over. KITE supplies the ICT needs for the primary and secondary schools in the province of Kerala, where they all use Linux. They decided to switch all of these to Debian. There was an ad-hoc BoF where locals were listening and fielding questions that the Kite Linux team had. It was great seeing all the energy and enthusiasm behind this effort, I hope someone will properly blog about this! I learned about the VGLUG Foundation, who are doing a tremendous job at promoting GNU/Linux in the country. They are also training up 50 people a year to be able to provide tech support for Debian. I came across the booth for Mostly Harmless, they liberate old hardware by installing free firmware on there. It was nice seeing all the devices out there that could be liberated, and how it can breathe new life into old hardware.
Some hopefully harmless soldering.
Overall, the community and their activities in India are very impressive, and I wish I had more time to get to know everyone better.
Food Oh yes, one more thing. The food was great. I tasted more different kinds of curry than I ever did in my whole life up to this point. The lunch on banana leaves was interesting, and so was learning how to eat this food properly by hand (thanks to the locals who insisted on teaching me!); it was a fruitful experience. This might catch on at home too: fewer dishes to take care of!
Special thanks to the DebConf23 Team I think this may have been one of the toughest DebConfs to organise yet, and I don't think many people outside of the DebConf team know about all the challenges and adversity this team has faced in organising it. Even just getting to the previous DebConf in Kosovo was a long, tedious and somewhat risky process. Through it all, they were absolute pros. Not once did I see them get angry or yell at each other; whenever a problem came up, they just dealt with it. They did a really stellar job, and I made a point of telling them on the last day that everyone appreciated all the work that they did.
Back to my nest I brought Dax a ball back from India; he seems to have forgiven me for not taking him along.
I'll probably take a few days soon to focus a bit on my bugs and catch up on my original DebCamp goals. If you made it this far, thanks for reading! And thanks to everyone for being such fantastic people.

15 September 2023

John Goerzen: How Gapped is Your Air?

Sometimes we want better-than-firewall security for things. For instance:
  1. An industrial control system for a municipal water-treatment plant should never have data come in or out
  2. Or, a variant of the industrial control system: it should only permit telemetry and monitoring data out, and nothing else in or out
  3. A system dedicated to keeping your GPG private keys secure should only have material to sign (or decrypt) come in, and signatures (or decrypted data) go out
  4. A system keeping your tax records should normally only have new records go in, but may on occasion have data go out (eg, to print a copy of an old record)
In this article, I'll talk about the high side (the high-security or high-sensitivity systems) and the low side (the lower-sensitivity or general-purpose systems). For the sake of simplicity, I'll assume the high side is a single machine, but it could just as well be a whole network. Let's focus on examples 3 and 4 to make things simpler, and let's consider the primary concern to be data exfiltration (someone stealing your data), with a secondary concern of data integrity (somebody modifying or destroying your data). You might think the safest possible approach is airgapped; that is, there is literally no physical network connection to the machine at all. This helps! But then the problem becomes: how do we deal with the inevitable need to legitimately get things on or off of the system? As I wrote in Dead USB Drives Are Fine: Building a Reliable Sneakernet, by using tools such as NNCP, you can certainly create a "sneakernet", using USB drives as transport. While this is a very secure setup, as with most things in security, it's less than perfect. The Wikipedia airgap article discusses some ways airgapped machines can still be exploited. It mentions that security holes relating to removable media have been exploited in the past. There are also other ways to get data out; for instance, Debian ships with gensio and minimodem, both of which can transfer data acoustically. But let's back up and think about why we consider airgapped machines to be so much more secure, and what the failure modes of other approaches might be.
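To make the sneakernet idea a bit more concrete, the USB exchange with NNCP looks roughly like this (a sketch from memory, with a made-up node name and mount point; check the NNCP documentation for the exact commands and flags your version supports):
$ # low side: queue a file for the high-side node and copy queued packets to the drive
$ nncp-file tax-record.pdf highside:
$ nncp-xfer -node highside /mnt/usbdrive
$ # carry the drive over; then, on the high side, import and process the packets
$ nncp-xfer /mnt/usbdrive
$ nncp-toss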

What about firewalls? You could very easily set up a high-side machine that is on a network, but is restricted to only one outbound TCP port. There could be a local firewall, and perhaps also a special port on an external firewall that implements the same restrictions. A variant on this approach would be two computers connected directly by a crossover cable, though this doesn't necessarily imply being more secure. Of course, the concern about a local firewall is that it could potentially be compromised. An external firewall might be too; for instance, if your credentials to it were on a machine that got compromised. This kind of dual compromise may be unlikely, but it is possible. We can also think about the complexity of a network stack and firewall configuration, and recognize that there may be various opportunities for things to be misconfigured or buggy in a system of that complexity. Another consideration is that data could be sent at any time, potentially making it harder to detect. On the other hand, network monitoring tools are commonplace, and this approach is convenient and cheap. I use a system along those lines to do my backups. Data is sent, gpg-encrypted and then encrypted again at the NNCP layer, to the backup server. The NNCP process on the backup server runs as an untrusted user, and dumps the gpg-encrypted files to a secure location that is then processed by a cron job using Filespooler. The backup server is on a dedicated firewall port, with a dedicated subnet. The only ports allowed out are for NNCP and NTP, and offsite backups. There is no default gateway. Not even DNS is permitted out (the firewall does the appropriate redirection). There is one pinhole allowed out, where a subset of the backup data is sent offsite. I initially used USB drives as transport, and it had no network connection at all. But there were disadvantages to doing this for backups, particularly that I'd have no backups for as long as I forgot to move the drives. The backup system also would have clock drift, and the offsite backup picture was more challenging. (The clock drift was a problem because I use 2FA on the system: a password, plus a TOTP generated by a Yubikey.) This is pretty good security, I'd think. What are the weak spots? Well, if there were somehow a bug in the NNCP client, and the remote NNCP were compromised, that could lead to a compromise of the NNCP account. But this by itself would accomplish little; some other vulnerability would have to be exploited on the backup server, because the NNCP account can't see plaintext data at all. I use borgbackup to send a subset of backup data offsite over ssh. borgbackup has to run as root to be able to access all the files, but the ssh it calls runs as a separate user. An ssh vulnerability is therefore unlikely to cause much damage. If, somehow, the remote offsite system were compromised and it was able to exploit a security issue in the local borgbackup, that would be a problem. But that sounds like a remote possibility. borgbackup itself can't even be used over a sneakernet since it is not asynchronous. A more secure solution would probably be using something like dar over NNCP. This would eliminate the ssh installation entirely, allow complete isolation between the data-access and communication stacks, and notably not require bidirectional communication. Logical separation matters too. My Roundup of Data Backup and Archiving Tools may be helpful here.
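To give a rough idea of what such an egress policy could look like in nftables terms, here is a sketch rather than the actual configuration described above; the NNCP port, the offsite address and the interface name are placeholders to adapt:
table inet egress {
  chain output {
    type filter hook output priority 0; policy drop;
    oifname "lo" accept
    udp dport 123 accept comment "NTP"
    tcp dport 5400 accept comment "NNCP (placeholder port)"
    ip daddr 192.0.2.10 tcp dport 22 accept comment "offsite backup pinhole (example address)"
  }
}
The important properties are the default-drop output policy and the absence of any default gateway or DNS rule.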
Other attack vectors could be a vulnerability in the kernel's networking stack, local root exploits that could be combined with exploiting NNCP or borgbackup to gain root, or local misconfiguration that makes the sandboxes around NNCP and borgbackup less secure. Because this system is in my basement in a utility closet with no chairs and no good place for a console, I normally manage it via a serial console. While it's a dedicated line between the system and another machine, if the other machine is compromised or an adversary gets access to the physical line, credentials (and perhaps even data) could leak, albeit slowly. But we can do much better with serial lines. Let's take a look.

Serial lines Some of us remember RS-232 serial lines and their once-ubiquitous DB-9 connectors. Traditionally, their speed maxed out at 115.2Kbps. Serial lines have the benefit that they can be a direct application-to-application link. In my backup example above, a serial line could directly link the NNCP daemon on one system with the NNCP caller on another, with no firewall or anything else necessary. It is simply up to those programs to open the serial device appropriately. This isn't perfect, however. Unlike TCP over Ethernet, a serial line has no inherent error checking. Modern programs such as NNCP and ssh assume that a lower layer is making the link completely clean and error-free for them, and will interpret any corruption as an attempt to tamper and sever the connection. However, there is a solution to that: gensio. In my page Using gensio and ser2net, I discuss how to run NNCP and ssh over gensio. gensio is a generic framework that can add framing, error checking, and retransmission to an unreliable link such as a serial port. It can also add encryption and authentication using TLS, which could be particularly useful for applications that aren't already doing that themselves. More traditional solutions for serial communications have their own built-in error correction. For instance, UUCP and Kermit were both designed in an era of noisy serial lines and might be an excellent fit for some use cases. The ZModem protocol also might be, though it offers somewhat less flexibility and automation than Kermit. I have found that certain USB-to-serial adapters by Gearmo will actually run at up to 2Mbps on a serial line! Look for the ones on their spec pages with an FTDI chipset rated at 920Kbps. It turns out they can successfully be driven faster, especially if gensio's relpkt is used. I've personally verified 2Mbps operation (Linux port speed 2000000) on Gearmo's USA-FTDI2X and the USA-FTDI4X. (I haven't seen any single-port options from Gearmo with the 920Kbps chipset, but they may exist.) Still, even at 2Mbps, speed may well be a limiting factor with some applications. If what you need is a console and some textual or batch data, it's probably fine. If you are sending 500GB backup files, you might look for something else. In theory, this USB to RS-422 adapter should work at 10Mbps, but I haven't tried it. But if the speed works, running a dedicated application over a serial link could be a nice and fairly secure option. One of the benefits of the airgapped approach is that data never leaves unless you are physically aware of transporting a USB stick. Of course, you may not be physically aware of what is ON that stick in the event of a compromise. A serial approach can get you something similar by, say, only plugging in the cable when you have data to transfer.
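If you want to sanity-check what a given adapter and speed can do before layering gensio or NNCP on top, a crude test is to set both ends to raw mode and push a file across while comparing checksums (a sketch; device names are examples, and without framing or flow control a copy may well arrive corrupted; that is exactly the problem relpkt and friends address):
$ stty -F /dev/ttyUSB0 2000000 raw -echo                     # run on both ends
$ cat /dev/ttyUSB0 > received.bin                            # receiver: stop with Ctrl-C when done
$ sha256sum testdata.bin; cat testdata.bin > /dev/ttyUSB0    # sender: note the hash, then send
$ sha256sum received.bin                                     # receiver: compare with the sender's hash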

Data diodes A traditional diode lets electrical current flow in only one direction. A data diode is the same concept, but for data: a hardware device that allows data to flow in only one direction. This could be useful, for instance, in the tax records system that should only receive data, or the industrial system that should only send it. Wikipedia claims that the simplest kind of data diode is a fiber link with transceivers connected in only one direction. I think you could go one simpler: a serial cable with only ground and TX connected at one end, wired to ground and RX at the other. (I haven't tried this.) This approach does have some challenges:
  • Many existing protocols assume a bidirectional link and won t be usable
  • There is a challenge of confirming data was successfully received. For a situation like telemetry, maybe it doesn't matter; another observation will come along in a minute. But for sending important documents, one wants to make sure they were properly received.
In some cases, the solution might be simple. For instance, with telemetry, just writing the data out down the serial port in a simple format may be enough. For sending files, various mitigations, such as sending them multiple times, etc., might help. You might also look into FEC-supporting infrastructure such as blkar and flute, but these don't provide an absolute guarantee. There is no perfect solution to knowing when a file has been successfully received if the data communication is entirely one-way.
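As an illustration of the send-it-multiple-times idea over a TX-only serial link, something along these lines could work (a sketch; the speed, device and file names are arbitrary, and a real setup would want proper framing or FEC rather than a bare text delimiter):
stty -F /dev/ttyS0 115200 raw -echo
for i in 1 2 3; do
  sha256sum records.tar > /dev/ttyS0       # checksum line so the receiver can spot bad copies
  base64 records.tar > /dev/ttyS0          # keep the payload printable
  echo '----END-OF-COPY----' > /dev/ttyS0  # crude delimiter between copies
  sleep 5
done
The receiving side just captures the stream, splits on the delimiter, and keeps the first copy whose checksum matches.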

Audio transport I hinted above that minimodem and gensio are both software audio modems. That is, you could literally use speakers and microphones, or alternatively audio cables, as a means of getting data into or out of these systems. This is pretty limited; it is 1200bps, often half-duplex, and could literally be disrupted by barking dogs in some setups. But hey, it's an option.
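For example, with minimodem the two ends can be as simple as this (assuming the default audio devices; 1200 is the baud mode):
$ minimodem --rx 1200 > incoming.txt     # receiving machine: decode audio from the capture device
$ minimodem --tx 1200 < outgoing.txt     # sending machine: play the data as audio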

Airgapped with USB transport This is the scenario I began with, and named some of the possible pitfalls above as well. In addition to those, note also that USB drives aren't necessarily known for their error-free longevity. Be prepared for failure.

Concluding thoughts I wanted to lay out a few things in this post. First, that simply being airgapped is generally a step forward in security, but is not perfect. Secondly, that both physical and logical separation matter. And finally, that while tools like NNCP can make airgapped-with-USB-drive-transport a doable reality, there are also alternatives worth considering, especially serial ports, firewalled hard-wired Ethernet, data diodes, and so forth. I think serial links, in particular, have been largely forgotten these days. Note: This article also appears on my website, where it may be periodically updated.

12 September 2023

Valhalla's Things: How I Keep my Life in Git

Posted on September 12, 2023
git secret_cabal greet
After watching My life in git, after subversion, after CVS. from DebConf, I've realized it's been a while since I talked about the way I keep everything1 I do in git, and I don't think I've ever done it online, so it looked like a good time for a blog post. Beyond git itself (of course), I use a few git-related programs:
  • myrepos (also known as mr) to manage multiple git repositories with one command;
  • vcsh to make it easy to keep dot-files under git;
  • git annex to store media files (anything that is big and will not change);
  • etckeeper to keep a history of the /etc directory;
  • gitolite and cgit to host my git repositories;
and some programs that don't use git directly, but easily interact with it:
  • ansible to keep track of the system configuration of all machines;
  • lesana as a project tracker and journal and to inventory the things made of atoms that are hard 2 to store in git.
All of these programs are installed from Debian packages, on stable (plus rarely backports) or testing, depending on the machine. I'm also grateful to the vcs-home people, who wrote most of the tools I use, and I sometimes hang around their IRC channel. And now, on to what I'm actually doing. With the git repositories I've decided to err on the side of too much granularity rather than too little3, so of course each project has its own repository, and so do different kinds of media files, dot-files related to different programs, etc. Most of the repositories are hosted on two gitolite servers: one runs on the home server, for stuff that should remain private, and the other one is on my VPS for things that are public (or may become public in the future), and also has a web interface with cgit. Of course things where I'm collaborating with other people are sometimes hosted elsewhere, mostly on salsa, sourcehut or on $DAYJOB-related gitlab instances. The .mr directory is where everything is managed: I don't have a single .mrconfig file but a few different ones, that in turn load all files in a directory with the same name:
  • collections.mr for the media file annexes and inventories (split into different files, so that computers with little disk space can only get the inventories);
  • private.mr for stuff that should only go on my own personal machine, not on shared ones;
  • projects.mr for the actual projects, with different files for the kinds of projects (software, docs, packaging, crafts, etc.);
  • setup.mr with all of the vcsh repositories, including the one that tracks the mr files (I'll talk about the circular dependency later);
  • work.mr for repositories that are related to $DAYJOB.
Then there are the files in the .mr/machines directory, each of which has the list of repositories that should be on a specific machine, including a generic workstation, but also specific machines such as e.g. the media center, which has a custom set of repositories. The dot files from my home directory are kept in vcsh, so that it's easy to split them out into different repositories, and I mostly use the simplest configuration described in the 30 Second How-to on its homepage; vcsh gives some commands to work on all vcsh repositories at the same time, but most of the time I work on a single repository, and use mr to act on more than one repo. The media collections are also pretty straightforward git-annex repositories, one for each kind of media (music, movies and other videos, e-books, pictures, etc.) and I don't use any auto-syncing features but simply copy and move files around between clones with the git annex copy, git annex move and git annex get commands. There isn't much to say about the project repositories (plain git), and I think that the way I use my own program lesana for inventories and project tracking is worth an article of its own; here I'll just say that the file format used has been designed (of course) to work nicely with git. On every machine I install etckeeper so that there is a history of the changes in the /etc directory, but that's only a local repository, not stored anywhere else, and is used mostly in case something breaks with an update or in similar situations. The authoritative source for the configuration of each machine is an ansible playbook (of course saved in git) which can be used to fully reconfigure the machine from a bare Debian installation. When such a reconfiguration from scratch happens, it will be in two stages: first a run of ansible does the system-wide configuration (including installing packages, creating users etc.), and then I log in on the machine and run mr to set up my own home. Of course there is a chicken-and-egg problem in that I need the mr configuration to know where to get the mr configuration, and that is solved by having set up two vcsh repositories from an old tarball export: the one with the ssh configuration to access the repositories and the one with the mr files. So, after a machine has been configured with ansible, what I'll actually do is log in, use vcsh pull to update those two repositories and then run mr to check out everything else. And that's it; if you have questions on something feel free to ask me on the fediverse or via email (contacts are in the about page). Update (2023-09-12 17:00ish): The ~/.mr directory is not special for mr, it's just what I use, and then I always run mr -c ~/.mr/some/suitable/file.mr, with the actual file being different depending on whether I'm registering a new repo or checking out / updating them. I could include some appropriate ~/.mr/machines/some_machine.mr in ~/.mrconfig, but I've never bothered to do so, since it wouldn't cover all use cases anyway. Thanks to the person on #vcs-home@OFTC who asked me the question :)
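As a rough sketch of what one of those per-topic files and its explicit invocation can look like (repository names and paths are invented for illustration; section paths are interpreted relative to the file's location unless written as absolute paths):
# ~/.mr/projects/software.mr
[code/someproject]
checkout = git clone 'gitolite@git.example.org:someproject' 'someproject'

[code/otherproject]
checkout = git clone 'gitolite@git.example.org:otherproject' 'otherproject'
and then, to act on the repositories listed for a given machine:
$ mr -c ~/.mr/machines/workstation.mr checkout   # first time
$ mr -c ~/.mr/machines/workstation.mr update     # afterwards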

  1. At least, everything that I made that is made of bits, and a diary and/or inventory of the things made of atoms.
  2. until we get a working replicator, I guess :D
  3. over time I've consolidated some of the repositories a bit, e.g. merging the repositories for music from different sources (CD rips, legal downloads, etc.) into a single repository, but that only happened a few times, and usually I'm fine with the excess granularity.

8 September 2023

Reproducible Builds: Reproducible Builds in August 2023

Welcome to the August 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. If you are interested in contributing to the project, please visit our Contribute page on our website.
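The core idea is easy to try by hand on any software you build yourself: build it twice (from two independent checkouts of the same source; the directory names below are only illustrative) and compare the results bit for bit, and then let diffoscope, one of the project's tools, explain any differences it finds. A minimal sketch:
$ (cd build1 && make clean all)
$ (cd build2 && make clean all)
$ sha256sum build1/app build2/app        # identical hashes: reproducible, at least in this environment
$ diffoscope build1/app build2/app       # otherwise, show where and why the builds differ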

Rust serialisation library moving to precompiled binaries Bleeping Computer reported that Serde, a popular Rust serialization framework, had decided to ship its serde_derive macro as a precompiled binary. As Ax Sharma writes:
The move has generated a fair amount of push back among developers who worry about its future legal and technical implications, along with a potential for supply chain attacks, should the maintainer account publishing these binaries be compromised.
After intensive discussions, use of the precompiled binary was phased out.

Reproducible builds, the first ten years On August 4th, Holger Levsen gave a talk at BornHack 2023 on the Danish island of Funen titled Reproducible Builds, the first ten years which promised to contain:
[…] an overview about reproducible builds: the past, the present and the future. How it started with a small [meeting] at DebConf13 (and before), how it grew from being a Debian effort to something many projects work on together, until in 2021 it was mentioned in an executive order of the president of the United States. (HTML slides)
Holger repeated the talk later in the month at Chaos Communication Camp 2023 in Zehdenick, Germany: A video of the talk is available online, as are the HTML slides.

Reproducible Builds Summit Just another reminder that our upcoming Reproducible Builds Summit is set to take place from October 31st to November 2nd 2023 in Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. If you're interested in joining us this year, please make sure to read the event page, the news item, or the invitation email that Mattia Rizzolo sent out, which have more details about the event and location. We are also still looking for sponsors to support the event, so do reach out to the organizing team if you are able to help. (Also of note: PackagingCon 2023 is taking place in Berlin just before our summit, and their schedule has just been published.)

Vagrant Cascadian on the Sustain podcast Vagrant Cascadian was interviewed on the SustainOSS podcast on reproducible builds:
Vagrant walks us through his role in the project where the aim is to ensure identical results in software builds across various machines and times, enhancing software security and creating a seamless developer experience. Discover how this mission, supported by the Software Freedom Conservancy and a broad community, is changing the face of Linux distros, Arch Linux, openSUSE, and F-Droid. They also explore the challenges of managing random elements in software, and Vagrant's vision to make reproducible builds a standard best practice that will ideally become automatic for users. Vagrant shares his work in progress and their commitment to the last mile problem.
The episode is available to listen to (or download) from the Sustain podcast website. As it happens, the episode was recorded at FOSSY 2023, and the video of Vagrant's talk from this conference (Breaking the Chains of Trusting Trust) is now available on Archive.org. It was also announced that Vagrant Cascadian will be presenting at the Open Source Firmware Conference in October on the topic of Reproducible Builds All The Way Down.

On our mailing list Carles Pina i Estany wrote to our mailing list during August with an interesting question concerning the practical steps to reproduce the hello-traditional package from Debian. The entire thread can be viewed from the archive page, as can Vagrant Cascadian s reply.

Website updates Rahul Bajaj updated our website to add a series of environment variations related to reproducible builds [ ], Russ Cox added the Go programming language to our projects page [ ] and Vagrant Cascadian fixed a number of broken links and typos around the website [ ][ ][ ].

Software development In diffoscope development this month, versions 247, 248 and 249 were uploaded to Debian unstable by Chris Lamb, who also added documentation for the new specialize_as method and expanded the documentation of the existing specialize method as well [ ]. In addition, Fay Stegerman added specialize_as and used it to optimise .smali comparisons when decompiling Android .apk files [ ], Felix Yan and Mattia Rizzolo corrected some typos in code comments [ , ], and Greg Chabala merged the RUN commands into a single layer in the package's Dockerfile [ ], thus greatly reducing the final image size. Lastly, Roland Clobus updated the tool descriptions to mark that the xb-tool has moved to another package within Debian [ ].
reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian updated the packaging to be compatible with Tox version 4. This was originally filed as Debian bug #1042918, and Holger Levsen uploaded this change to Debian unstable as version 0.7.26 [ ].

Distribution work In Debian, 28 reviews of Debian packages were added, 14 were updated and 13 were removed this month adding to our knowledge about identified issues. A number of issue types were added, including Chris Lamb adding a new timestamp_in_documentation_using_sphinx_zzzeeksphinx_theme toolchain issue.
In August, F-Droid added 25 new reproducible apps and saw 2 existing apps switch to reproducible builds, making 191 apps in total that are published with Reproducible Builds and using the upstream developer s signature. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Disable Debian live image creation jobs until an OpenQA credential problem has been fixed. [ ]
    • Run our maintenance scripts every 3 hours instead of every 2. [ ]
    • Export data for unstable to the reproducible-tracker.json data file. [ ]
    • Stop varying the build path, we want reproducible builds. [ ]
    • Temporarily stop updating the pbuilder.tgz for Debian unstable due to #1050784. [ ][ ]
    • Correctly document that we are not varying usrmerge. [ ][ ]
    • Mark two armhf nodes (wbq0 and jtx1a) as down; investigation is needed. [ ]
  • Misc:
    • Force reconfiguration of all Jenkins jobs, due to the recent rise of zombie processes. [ ]
    • In the node health checks, also try to restart failed ntpsec, postfix and vnstat services. [ ][ ][ ]
  • System health checks:
    • Detect Debian live build failures due to missing credentials. [ ][ ]
    • Ignore specific types of known zombie processes. [ ][ ]
In addition, Vagrant Cascadian updated the scripts to use a predictable build path that is consistent with the one used on buildd.debian.org. [ ][ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

30 August 2023

Dirk Eddelbuettel: RcppArmadillo 0.12.6.3.0 on CRAN: New Upstream Bugfix

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1092 other packages on CRAN, downloaded 30.3 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 549 times according to Google Scholar. This release brings bugfix upstream release 12.6.3. We skipped 12.6.2 at CRAN (as discussed in the previous release notes) as it only affected Armadillo-internal random-number generation (RNG). As we default to supplying the RNGs from R, this did not affect RcppArmadillo. The bug fixes in 12.6.3 are for CSV reading, which too will most likely be done by R tools for R users, but given two minor bugfix releases an update was in order. I ran the full reverse-dependency check against the now more than 1000 packages overnight: no issues. CRAN processed the package fully automatically as it has no issues, and nothing popped up in reverse-dependency checking. The set of changes for the last two RcppArmadillo releases follows.

Changes in RcppArmadillo version 0.12.6.3.0 (2023-08-28)
  • Upgraded to Armadillo release 12.6.3 (Cortisol Retox)
    • Fix for corner-case in loading CSV files with headers
    • For consistent file handling, all .load() functions now open text files in binary mode

Changes in RcppArmadillo version 0.12.6.2.0 (2023-08-08)
  • Upgraded to Armadillo release 12.6.2 (Cortisol Retox)
    • use thread-safe Mersenne Twister as the default RNG on all platforms
    • use unique RNG seed for each thread within multi-threaded execution (such as OpenMP)
    • explicitly document arma_rng::set_seed() and arma_rng::set_seed_random()
  • None of the changes above affect R use as RcppArmadillo connects the RNGs used by R to Armadillo

Courtesy of my CRANberries, there is a diffstat report relative to the previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc. should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

25 August 2023

Scarlett Gately Moore: KDE Snaps Weekly report, Debian recommenced!

Now that all the planets are fixed, please see what you missed here! https://www.scarlettgatelymoore.dev/kde-a-day-in-the-life-the-kde-snapcrafter-part-2/ EXTREMELY IMPORTANT: I am still looking for a super awesome team lead for a super amazing project involving KDE and Snaps. Time is running out, and well, the KDE world will be a better place if this project goes through! I would like to clarify: this is a paid position! A current KDE developer would be ideal, as it is a small team, so your time will be split between managing and coding alike. If you or anyone you know might be interested, please contact me ASAP! Lots of news on the snap front: 23.04.3 is now complete with new snaps! I know, just in time for 23.08.0. I have fixed some major issues in this release, so 23.08 should go much quicker. Even quicker if my per-repo snapcraft files get approved! We have more PIM snaps, however I am waiting for reserved name approvals from the snap store. I was approached to decouple the qt and frameworks sdk snaps and I have agreed, given that security updates are near impossible when new versions are released. Conversation here: https://forum.snapcraft.io/t/proposal-for-changes-to-kde-content-snap-and-extension I have started qt5 here: https://github.com/ScarlettGatelyMoore/qt-5-15-10-snap And some exciting news: I have started the KF6 content pack! I am doing it like the above, and I am using the qt6 content pack Jarred Wilson has made. This is a requirement to start the plasma snap. Progress can be tracked here: https://github.com/ScarlettGatelyMoore/kf6-snap I still have an ongoing request for snapcraft files in their respective repositories. While defending my request I have tested some options. Snapcraft files in the repository do allow for proper snap recipes in launchpad by mirroring the repo in launchpad -> create snap recipe. I created a recipe based on the stable branch and it created and published the snap as expected. After being pointed to the flatpak workflow I discovered snaps have a similar store feature with github; however, I will need to create a github repo for each snap, which is tempting. I want to avoid duplication of snapcraft files, but I guess this is what they do for flatpak? I never received an answer. Snapcraft: Some more tidying of the qmake plugin and resolved some review conversations. Debian! I am back to getting things into Debian proper, starting with the golang packages I was working on for bubble-gum, a cool console beautification application. As each one passes through NEW I will keep uploading. I will be checking in with the qt-kde team to see what needs doing. I am looking into seeing if openvoices is still a viable replacement for mycroft; hopefully all that work isn't wasted time. And finally, I do hate having to ask, but as we quickly approach September, I have not come close to earning enough to pay my pesky bills, required to have a place to live and eat! I am seeking employment as a backup if my amazing project falls through. I tried to enable ads, but that broke my planet feeds, and I can't have that! So without further ado: anything helps! Also please share! Thanks for your consideration.

4 August 2023

John Goerzen: Try the Last Internet Kermit Server

$ grep kermit /etc/services
kermit          1649/tcp
What is this mysterious protocol? Who uses it and what is its story? This story is a winding one, beginning in 1981. Kermit is, to the best of my knowledge, the oldest actively-maintained software package with an original developer still participating. It is also a scripting language, an Internet server, a (scriptable!) SSH client, and a file transfer protocol. And my first use of it was talking to my HP-48GX calculator over a 9600bps serial link. Yes, that calculator had a Kermit server built in. But let's back up and talk about serial ports and modems.

Serial Ports and Modems In my piece The PC & Internet Revolution in Rural America, I recently talked about getting a modem; what an excitement it was to get one! I realize that many people today have never used a serial line or a modem, so let's briefly discuss. Before Ethernet and Wifi took off in a big way, in the 1990s-2000s, two computers would talk to each other over a serial line and a modem. By modern standards, these were slow; 300bps was a common early speed. They also (at least in the beginning) had no kind of error checking. Characters could be dropped or changed. Sometimes even those speeds were faster than the receiving device could handle. Some serial links were 7-bit, and wouldn't even pass all 7-bit characters; for instance, sending a Ctrl-S could lock up a remote until you sent Ctrl-Q. And computers back in the 1970s and 1980s weren't as uniform as they are now. They used different character sets, different line endings, and even had different notions of what a file is. Today's notion of a file as whatever set of binary bytes an application wants it to be was by no means universal; some systems treated a file as a set of fixed-length records, for instance. So there were a lot of challenges in reliably moving files between systems. Kermit was introduced to reliably move files between systems using serial lines, automatically working around the varieties of serial lines, detecting errors and retransmitting, managing transmit speeds, and adapting between architectures as appropriate. Quite a task! And perhaps this explains why it was supported on a calculator with a primitive CPU by today's standards. Serial communication, by the way, is still commonplace, though now it isn't prominent in everyone's home PC setup. It's used a lot in industrial equipment, avionics, embedded systems, and so forth. The key point about serial lines is that they aren't inherently multiplexed or packetized. Whereas an Ethernet network is designed to let many dozens of applications use it at once, a serial line typically runs only one (unless it is something like PPP, which is designed to do multiplexing over the serial line). So it became useful to be able to both log in to a machine and transfer files with it. That is, incidentally, still useful today.

Kermit and XModem/ZModem I wondered: why did we end up with two diverging sets of protocols, created at about the same time? The Kermit website has the answer: essentially, BBSs could assume 8-bit clean connections, so XModem and ZModem had much less complexity to worry about. Kermit, on the other hand, was highly flexible. Although ZModem came out a few years before Kermit had its performance optimizations, by about 1993 Kermit was on par or faster than ZModem.

Beyond serial ports As LANs and the Internet came to be popular, people started to use telnet (and later ssh) to connect to remote systems, rather than serial lines and modems. FTP was an early way to transfer files across the Internet, but it had its challenges. Kermit added telnet support, as well as later support for ssh (as a wrapper around the ssh command you already know). Now you could easily log in to a machine and exchange files with it without missing a beat. And so it was that the Internet Kermit Service Daemon (IKSD) came into existence. It allows a person to set up a Kermit server, which can authenticate against local accounts or present anonymous access akin to FTP. And so I established the quux.org Kermit Server, which runs the Unix IKSD (part of the Debian ckermit package).

Trying Out the quux.org Kermit Server There are more instructions on the quux.org Kermit Server page! You can connect to it using either telnet or the kermit program. I won't duplicate all of the information here, but here's what it looks like to connect:
$ kermit
C-Kermit 10.0 Beta.08, 15 Dec 2022, for Linux+SSL (64-bit)
 Copyright (C) 1985, 2022,
  Trustees of Columbia University in the City of New York.
  Open Source 3-clause BSD license since 2011.
Type ? or HELP for help.
(/tmp/t/) C-Kermit>iksd /user:anonymous kermit.quux.org
 DNS Lookup...  Trying 135.148.101.37...  Reverse DNS Lookup... (OK)
Connecting to host glockenspiel.complete.org:1649
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------

 >>> Welcome to the Internet Kermit Service at kermit.quux.org <<<

To log in, use 'anonymous' as the username, and any non-empty password

Internet Kermit Service ready at Fri Aug  4 22:32:17 2023
C-Kermit 10.0 Beta.08, 15 Dec 2022
kermit

Enter e-mail address as Password: [redacted]

Anonymous login.

You are now connected to the quux kermit server.

Try commands like HELP, cd gopher, dir, and the like.  Use INTRO
for a nice introduction.

(~/) IKSD>
You can even recursively download the entire Kermit mirror: over 1GB of files!
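Once connected, a short session might look something like this (the file and directory names here are only illustrative, not actual contents of the server):
(~/) IKSD> intro
(~/) IKSD> dir
(~/) IKSD> cd gopher
(~/) IKSD> get somefile.txt
(~/) IKSD> bye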

Conclusions So, have fun. Enjoy this experience from the 1980s. And note that Kermit also makes a better ssh client than ssh in a lot of ways; see ideas on my Kermit page. This page also has a permanent home on my website, where it may be periodically updated.

23 July 2023

Dirk Eddelbuettel: #41: Another r2u Example - Really Simple CI

Welcome to the 41st post in the R^4 series. Just as the previous post illustrated r2u use to empower interactive Google Colab sessions, today we want to look at continuous integration via GitHub Actions. Actions are very powerful, yet also intimidating and complex. How does one know what to run? How does one ensure requirements are installed? What do these other actions do? Here we offer a much simpler yet fully automatic solution. It takes advantage of the fact that r2u integrates fully and automatically with the system, here apt, without us having to worry about the setup. One way to make this very easy is the use of the Rocker containers for r2u. They already include the few lines of simple (and scriptable) setup, and have bspm set up so that R commands to install packages dispatch to apt and will bring all required dependencies automatically and easily. With that the required yaml file for an action can be as simple as this:
name: r2u

on:
  push:
  pull_request:
  release:

jobs:
  ci:
    runs-on: ubuntu-latest
    container:
      image: rocker/r2u:latest
    steps:
      - uses: actions/checkout@v3
      - name: SessionInfo
        run: R -q -e 'sessionInfo()'
      #- name: System Dependencies
      #  # can be used to install e.g. cmake or other build dependencies
      #  run: apt update -qq && apt install --yes --no-install-recommends cmake git
      - name: Package Dependencies
        run: R -q -e 'remotes::install_deps(".", dependencies=TRUE)'
      - name: Build Package
        run: R CMD build --no-build-vignettes --no-manual .
      - name: Check Package
        run: R CMD check --no-vignettes --no-manual $(ls -1tr *.tar.gz | tail -1)
There are only a few key components here. First, we have the on block where for simplicity we select pushes, pull requests and releases. One could reduce this to just pushes by removing or commenting out the next two lines. Many further refinements are possible and documented but not required. Second, the jobs section and its sole field ci say that we are running this CI on Ubuntu in its latest release. Importantly we then also select the rocker container for r2u, meaning that we explicitly select running in this container (which happens to be an extension and refinement of ubuntu-latest). The latest tag points to the most recent LTS release, currently jammy aka 22.04. This choice also means that our runs are limited to Ubuntu and exclude macOS and Windows. That is a choice: not every CI task needs to burn extra (and more expensive) cpu cycles on the alternative OS, yet those can always be added via other yaml files, possibly conditioned on fewer runs (say: only pull requests) if needed. Third, we have the basic sequence of steps. We check out the repo this file is part of (very standard). After that we ask R to show the session info in case we need to troubleshoot. (These two lines could be commented out.) Next we show a commented-out segment we needed in another repo where we needed to add cmake and git as the package in question required local compilation during build. Such a need is fairly rare, but as shown can be accommodated easily while taking advantage of the rich development infrastructure provided by Ubuntu. But the step should be optional for most R packages so it is commented out here. The next step uses the remotes package to look at the DESCRIPTION file and install all dependencies which, thanks to r2u and bspm, will use Ubuntu binaries, making it both very fast, very easy, and generally failsafe. Finally we do the two standard steps of building the source package and checking it (while omitting vignettes and the (pdf) manual as the container does not bother with a full texlive installation; this could be altered if desired in a derived container). And that's it! The startup cost is a few seconds to pull the container, plus a few more seconds for dependencies; and let us recall that e.g. the entire tidyverse installs all one hundred plus packages in about twenty seconds as shown in an earlier post. After that the next cost is generally just what it takes to build and check your package once all requirements are in. To use such a file for continuous integration, we can install it in the .github/workflows/ directory of a repository. One filename I have used is .github/workflows/r2u.yaml, making it clear what this does and how. More information about r2u is at its site, and we answered some questions in issues, and at stackoverflow. More questions are always welcome! If you like this or other open-source work I do, you can now sponsor me at GitHub.
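If you want to see the same bspm/apt dispatch outside of CI, the container can be tried locally along these lines (a sketch; the package chosen here is arbitrary):
$ docker run --rm rocker/r2u:latest \
    R -q -e 'install.packages("remotes"); packageVersion("remotes")'
This should resolve the package (and any dependencies) as Ubuntu binaries via apt rather than compiling from source.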

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

18 July 2023

Sergio Talens-Oliag: Testing cilium with k3d and kind

This post describes how to deploy cilium (and hubble) using docker on a Linux system with k3d or kind to test it as a CNI and Service Mesh. I wrote some scripts to do a local installation and evaluate cilium for use at work (in fact we are using cilium on an EKS cluster now), but I thought it would be a good idea to share my original scripts in this blog just in case they are useful to somebody, at least for playing a little with the technology.

Installation For each platform we are going to deploy two clusters on the same docker network; I've chosen this model because it allows the containers to see the addresses managed by metallb from both clusters (the idea is to use those addresses for load balancers and treat them as if they were public). The installation(s) use cilium as CNI, metallb for BGP (I tested the cilium options, but I wasn't able to configure them right) and nginx as the ingress controller (again, I tried to use cilium but something didn't work either). To be able to use the previous components, some default options have been disabled on k3d and kind and, in the case of k3d, a lot of k3s options (traefik, servicelb, kubeproxy, network-policy, etc.) have also been disabled to avoid conflicts. To use the scripts we need to install cilium, docker, helm, hubble, k3d, kind, kubectl and tmpl in our system. After cloning the repository, the sbin/tools.sh script can be used to do that on a linux-amd64 system:
$ git clone https://gitea.mixinet.net/blogops/cilium-docker.git
$ cd cilium-docker
$ ./sbin/tools.sh apps
Once we have the tools, to install everything on k3d (for kind replace k3d by kind) we can use the sbin/cilium-install.sh script as follows:
$ # Deploy first k3d cluster with cilium & cluster-mesh
$ ./sbin/cilium-install.sh k3d 1 full
[...]
$ # Deploy second k3d cluster with cilium & cluster-mesh
$ ./sbin/cilium-install.sh k3d 2 full
[...]
$ # The 2nd cluster-mesh installation connects the clusters
If we run the command cilium status after the installation we should get an output similar to the one seen in the following screenshot:
cilium status
The installation script uses the following templates:
Once we have finished our tests we can remove the installation using the sbin/cilium-remove.sh script.

Some notes about the configuration
  • As noted in the documentation, the cilium deployment needs to mount the bpffs on /sys/fs/bpf and cgroupv2 on /run/cilium/cgroupv2; that is done automatically on kind, but fails on k3d because the image does not include bash (see this issue). To fix it we mount a script on all the k3d containers that is executed each time they are started (the script is mounted as /bin/k3d-entrypoint-cilium.sh because the /bin/k3d-entrypoint.sh script executes the scripts that follow the pattern /bin/k3d-entrypoint-*.sh before launching the k3s daemon). The source code of the script is available here.
  • When testing the multi-cluster deployment with k3d we have found issues with open files, looks like they are related to inotify (see this page on the kind documentation); adding the following to the /etc/sysctl.conf file fixed the issue:
    # fix inotify issues with docker & k3d
    fs.inotify.max_user_watches = 524288
    fs.inotify.max_user_instances = 512
  • Although the deployment theoretically supports it, we are not using cilium as the cluster ingress yet (it did not work, so it is no longer enabled) and we are also ignoring the gateway-api for now.
  • The documentation uses the cilium cli to do all the installations, but I noticed that following that route the current version does not work right with hubble (it messes up the TLS support; there are some notes about the problems on this cilium issue), so we are deploying with helm right now. The problem with the helm approach is that there is no official documentation on how to install the cluster mesh with it (there is a request for documentation here), so we are using the cilium cli for that part for now, and it looks like it does not break the hubble configuration.

Tests To test cilium we have used some scripts & additional config files that are available in the test subdirectory of the repository:
  • cilium-connectivity.sh: a script that runs the cilium connectivity test for one cluster or in multi-cluster mode (for mesh testing). If we export the variable HUBBLE_PF=true the script executes the command cilium hubble port-forward before launching the tests.
  • http-sw.sh: Simple tests for cilium policies from the cilium demo; the script deploys the Star Wars demo application and allows us to add the L3/L4 policy or the L3/L4/L7 policy, test the connectivity and view the policies.
  • ingress-basic.sh: This test is for checking the ingress controller, it is prepared to work against cilium and nginx, but as explained before the use of cilium as an ingress controller is not working as expected, so the idea is to call it with nginx always as the first argument for now.
  • mesh-test.sh: Tool to deploy a global service on two clusters, change the service affinity to local or remote, enable or disable if the service is shared and test how the tools respond.

Running the tests The cilium-connectivity.sh script executes the standard cilium tests:
$ ./test/cilium-connectivity.sh k3d 12
   Monitor aggregation detected, will skip some flow validation
steps
  [k3d-cilium1] Creating namespace cilium-test for connectivity
check...
  [k3d-cilium2] Creating namespace cilium-test for connectivity
check...
[...]
  All 33 tests (248 actions) successful, 2 tests skipped,
0 scenarios skipped.
To test how the cilium policies work use the http-sw.sh script:
kubectx k3d-cilium2 # (just in case)
# Create test namespace and services
./test/http-sw.sh create
# Test without policies (exhaust-port fails by design)
./test/http-sw.sh test
# Create and view L3/L4 CiliumNetworkPolicy
./test/http-sw.sh policy-l34
# Test policy (no access from xwing, exhaust-port fails)
./test/http-sw.sh test
# Create and view L7 CiliumNetworkPolicy
./test/http-sw.sh policy-l7
# Test policy (no access from xwing, exhaust-port returns 403)
./test/http-sw.sh test
# Delete http-sw test
./test/http-sw.sh delete
And to see how the service mesh works use the mesh-test.sh script:
# Create services on both clusters and test
./test/mesh-test.sh k3d create
./test/mesh-test.sh k3d test
# Disable service sharing from cluster 1 and test
./test/mesh-test.sh k3d svc-shared-false
./test/mesh-test.sh k3d test
# Restore sharing, set local affinity and test
./test/mesh-test.sh k3d svc-shared-default
./test/mesh-test.sh k3d svc-affinity-local
./test/mesh-test.sh k3d test
# Delete deployment from cluster 1 and test
./test/mesh-test.sh k3d delete-deployment
./test/mesh-test.sh k3d test
# Delete test
./test/mesh-test.sh k3d delete

10 July 2023

Lukas Märdian: Netplan and systemd-networkd on Debian Bookworm

Debian's cloud-images are using systemd-networkd as their default network stack in Bookworm: a slim and feature-rich networking daemon that comes included with systemd itself. Debian's cloud-images are deploying Netplan on top of this as an easy-to-use, declarative control layer. If you want to experiment with systemd-networkd and Netplan on Debian, this can be done easily in QEMU using the official images. To start, you need to download the relevant .qcow2 Debian cloud-image from: https://cloud.debian.org/images/cloud/bookworm/latest/
$ wget https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2

Prepare a cloud image Next, you need to prepare some configuration files for cloud-init and Netplan, to prepare a data-source (seed.img) for your local cloud-image.
$ cat > meta.yaml <<EOF
instance-id: debian01
local-hostname: cloudimg
EOF
$ cat > user.yaml <<EOF
#cloud-config
ssh_pwauth: true
password: test
chpasswd:
  expire: false
EOF
$ cat > netplan.yaml <<EOF
network:
  version: 2
  ethernets:
    id0:
      match:
        macaddress: "ca:fe:ca:fe:00:aa"
      dhcp4: true
      dhcp6: true
      set-name: lan0
EOF
Once all configuration is prepared, you can create the local data-source image, using the cloud-localds tool from the cloud-image-utils package:
$ cloud-localds --network-config=netplan.yaml seed.img user.yaml meta.yaml

Launch the local VM Now, everything is prepared to launch a QEMU VM with two NICs and do some experimentation! The following command will launch an ephemeral environment for you, keeping the original Debian cloud-image untouched. If you want to preserve any changes on disk, you can remove the trailing -snapshot parameter.
$ qemu-system-x86_64 \
  -machine accel=kvm,type=q35 \
  -cpu host \
  -m 2G \
  -device virtio-net-pci,netdev=net0,mac=ca:fe:ca:fe:00:aa \
  -netdev user,id=net0,hostfwd=tcp::2222-:22 \
  -nic user,model=virtio-net-pci,mac=f0:0d:ca:fe:00:bb \
  -drive if=virtio,format=qcow2,file=debian-12-generic-amd64.qcow2 \
  -drive if=virtio,format=raw,file=seed.img -snapshot
We set up the default debian user account through cloud-init's user-data configuration above, so you can now log in to the system using that user with the (very unsafe!) password "test".
$ ssh -o "StrictHostKeyChecking=no" -o "UserKnownHostsFile=/dev/null" -p 2222 debian@localhost # password: test

Experience Netplan and systemd-networkd Once logged in successfully, you can execute the netplan status command to check the system's network configuration, as configured through cloud-init's netplan.yaml passthrough. So you've already used Netplan implicitly at this point, and it did all the configuration of systemd-networkd for you in the background!
debian@cloudimg:~$ sudo netplan status -a
     Online state: online
    DNS Addresses: 10.0.2.3 (compat)
       DNS Search: .
   1: lo ethernet UNKNOWN/UP (unmanaged)
      MAC Address: 00:00:00:00:00:00
        Addresses: 127.0.0.1/8
                   ::1/128
           Routes: ::1 metric 256
   2: enp0s2 ethernet DOWN (unmanaged)
      MAC Address: f0:0d:ca:fe:00:bb (Red Hat, Inc.)
   3: lan0 ethernet UP (networkd: id0)
      MAC Address: ca:fe:ca:fe:00:aa (Red Hat, Inc.)
        Addresses: 10.0.2.15/24 (dhcp)
                   fec0::c8fe:caff:fefe:aa/64
                   fe80::c8fe:caff:fefe:aa/64 (link)
    DNS Addresses: 10.0.2.3
           Routes: default via 10.0.2.2 from 10.0.2.15 metric 100 (dhcp)
                   10.0.2.0/24 from 10.0.2.15 metric 100 (link)
                   10.0.2.2 from 10.0.2.15 metric 100 (dhcp, link)
                   10.0.2.3 from 10.0.2.15 metric 100 (dhcp, link)
                   fe80::/64 metric 256
                   fec0::/64 metric 100 (ra)
                   default via fe80::2 metric 100 (ra)
As you can see from this output, the lan0 interface is configured via the id0 Netplan ID to be managed by systemd-networkd. Compare this data to the netplan.yaml file above, the networkctl output, the local Netplan configuration in /etc/netplan/ and the auto-generated systemd-networkd configuration.
debian@cloudimg:~$ networkctl 
IDX LINK   TYPE     OPERATIONAL SETUP     
  1 lo     loopback carrier     unmanaged
  2 enp0s2 ether    off         unmanaged
  3 lan0   ether    routable    configured
3 links listed.
debian@cloudimg:~$ cat /etc/netplan/50-cloud-init.yaml 
# [...]
network:
    ethernets:
        id0:
            dhcp4: true
            dhcp6: true
            match:
                macaddress: ca:fe:ca:fe:00:aa
            set-name: lan0
    version: 2

debian@cloudimg:~$ ls -l /run/systemd/network/
total 8
-rw-r--r-- 1 root root  78 Jul  5 15:23 10-netplan-id0.link
-rw-r--r-- 1 root root 137 Jul  5 15:23 10-netplan-id0.network
Now you can go ahead and try something more advanced, like link aggregation, using the second NIC that you configured for this QEMU VM and explore all the possibilities of Netplan on Debian, by checking the Netplan YAML documentation.
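As a starting point for that link-aggregation experiment, a bond over the two NICs could be described roughly like this (a sketch based on the Netplan YAML documentation; the MAC addresses match the QEMU flags above, but the bond mode and other parameters are just examples):
$ cat > netplan-bond.yaml <<EOF
network:
  version: 2
  ethernets:
    id0:
      match:
        macaddress: "ca:fe:ca:fe:00:aa"
    id1:
      match:
        macaddress: "f0:0d:ca:fe:00:bb"
  bonds:
    bond0:
      interfaces: [id0, id1]
      parameters:
        mode: active-backup
      dhcp4: true
EOF
This file could then be passed to cloud-localds via --network-config in place of the netplan.yaml used earlier.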

28 June 2023

Russ Allbery: Review: Translation State

Review: Translation State, by Ann Leckie
Publisher: Orbit
Copyright: June 2023
ISBN: 0-316-29024-6
Format: Kindle
Pages: 354
Translation State is a science fiction novel set in the same universe as the Imperial Radch series and Provenance. It is not truly a sequel to any of those books, but as with Provenance, it has significant spoilers for the conclusion of Ancillary Mercy. Provenance takes place earlier, but its plot is unrelated as far as I can recall. Enea has spent much of hir adult life living with hir difficult and somewhat abusive grandmanan and, in recent years, running her household. Now, Grandmanan is dead, and the relatives who have been waiting to inherit Grandmanan's wealth are descending like a flock of vultures and treating hir like a servant. Enea can barely stand to be around them. It is therefore somewhat satisfying to watch their reactions when they discover that there is no estate. Grandmanan had been in debt and sold her family title to support herself for the rest of her life. Enea will receive an allowance and an arranged job that expects a minimum of effort. Everyone else gets nothing. It's still a wrenching dislocation from everything Enea has known, but at least sie can relax, travel, and not worry about money. Enea's new job for the Office of Diplomacy is to track down a fugitive who disappeared two hundred years earlier. The request came from the Radchaai Translators Office, the agency responsible for the treaty with the alien Presger, and was resurrected due to the upcoming conclave to renegotiate the treaty. No one truly expects Enea to find this person or any trace of them. It's a perfect quiet job to reward hir with travel and a stipend for putting up with Grandmanan all these years. This plan lasts until Enea's boredom and sense of duty get the better of hir. Enea is one of three viewpoint characters. Reet lives a quiet life in which he only rarely thinks about murdering people. He has a menial job in Rurusk Station, at least until he falls in with an ethnic club that may be a cover for more political intentions. Qven... well, Qven is something else entirely. Provenance started with some references to the Imperial Radch trilogy but then diverged into its own story. Translation State does the opposite. It starts as a cozy pseudo-detective story following Enea and a slice-of-life story following Reet, interspersed with baffling chapters from Qven, but by the end of the book the characters are hip-deep in the trilogy aftermath. It's not the direct continuation of the political question of the trilogy that I'm still partly hoping for, but it's adjacent. As you might suspect from the title, this story is about Presger Translators. Exactly how is not entirely obvious at the start, but it doesn't take long for the reader to figure it out. Leckie fills in a few gaps in the world-building and complicates (but mostly retains) the delightfully askew perspective Presger Translators have on the world. For me, though, the best part of the book was the political maneuvering once the setup is complete and all the characters are in the same place. The ending, unfortunately, dragged a little bit; the destination of the story was obvious but delayed by characters not talking to each other. I tend to find this irritating, but I know tastes differ. I was happily enjoying Translation State but thinking that it didn't suck me in as much as the original trilogy, and even started wondering if I'd elevated the Imperial Radch trilogy too high in my memory. Then an AI ship showed up and my brain immediately got fully invested in the story.
I'm very happy to get whatever other stories in this universe Leckie is willing to write, but I would have been even happier if a ship appeared as more than a supporting character. To the surprise of no one who reads my reviews, I clearly have strong preferences in protagonists. This wasn't one of my favorites, but it was a solidly good book, and I will continue to read everything Ann Leckie writes.
If you liked Provenance, I think you'll like this one as well. We once again get a bit more information about the aliens in this universe, and this time around we get more Radchaai politics, but the overall tone is closer to Provenance. Great powers are in play, but the focus is mostly on the smaller scale. Recommended, but of course read the Imperial Radch trilogy first.
Note that Translation State uses a couple of sets of neopronouns to represent different gender systems. My brain still struggles with parsing them grammatically, but this book was good practice. It was worth the effort to watch people get annoyed at the Radchaai unwillingness to acknowledge more than one gender.
Content warning: Cannibalism (Presger Translators are very strange), sexual assault.
Rating: 8 out of 10

20 June 2023

Vasudev Kamath: Notes: Experimenting with ZRAM and Memory Overcommit

Introduction The ZRAM module in the Linux kernel creates a memory-backed block device that stores its content in a compressed format. It offers users the choice of compression algorithms such as lz4, zstd, or lzo. These algorithms differ in compression ratio and speed: zstd provides the best compression but is slower, while lz4 offers higher speed but a lower compression ratio.
Using ZRAM as Swap One interesting use case for ZRAM is utilizing it as swap space in the system. There are two utilities available for configuring ZRAM as swap: zram-tools and systemd-zram-generator. However, Debian Bullseye lacks systemd-zram-generator, making zram-tools the only option for Bullseye users. While it's possible to use systemd-zram-generator by self-compiling or via cargo, I preferred using tools available in the distribution repository due to my restricted environment.
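Both utilities ultimately just drive the zram sysfs interface. As a rough sketch of what happens under the hood (assuming a single device /dev/zram0, the zstd algorithm, and a 2 GiB device size; run as root), the manual equivalent would be something like:
# Load the module with a single device and pick a compression algorithm
modprobe zram num_devices=1
cat /sys/block/zram0/comp_algorithm    # lists available algorithms, the active one in brackets
echo zstd > /sys/block/zram0/comp_algorithm
# Size the device, then format and enable it as swap
echo 2G > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon --priority 100 /dev/zram0
The tools described below automate roughly this sequence and take care of doing it again at boot.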
Installation The installation process is straightforward. Simply execute the following command:
apt-get install zram-tools
Configuration The configuration involves editing /etc/default/zramswap, a simple shell snippet that is sourced by the /usr/bin/zramswap script. Here's an example of the configuration I used:
# Compression algorithm selection
# Speed: lz4 > zstd > lzo
# Compression: zstd > lzo > lz4
# This is not inclusive of all the algorithms available in the latest kernels
# See /sys/block/zram0/comp_algorithm (when the zram module is loaded) to check
# the currently set and available algorithms for your kernel [1]
# [1]  https://github.com/torvalds/linux/blob/master/Documentation/blockdev/zram.txt#L86
ALGO=zstd
# Specifies the amount of RAM that should be used for zram
# based on a percentage of the total available memory
# This takes precedence and overrides SIZE below
PERCENT=30
# Specifies a static amount of RAM that should be used for
# the ZRAM devices, measured in MiB
# SIZE=256000
# Specifies the priority for the swap devices, see swapon(2)
# for more details. A higher number indicates higher priority
# This should probably be higher than hdd/ssd swaps.
# PRIORITY=100
I chose zstd as the compression algorithm for its superior compression ratio. Additionally, I set the zram device size to 30% of total memory. After modifying the configuration, restart zramswap.service to activate the swap:
systemctl restart zramswap.service
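Once the service has restarted, the zram device should be active as swap right away; a quick way to confirm is:
swapon --show      # the zram device should appear with the configured size and priority
cat /proc/swaps    # the same information read directly from the kernel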
Using systemd-zram-generator For Debian Bookworm users, an alternative option is systemd-zram-generator. Although zram-tools is still available in Debian Bookworm, systemd-zram-generator offers a more integrated solution within the systemd ecosystem. Below is the equivalent of the above configuration for systemd-zram-generator, located at /etc/systemd/zram-generator.conf:
# This config file enables a /dev/zram0 swap device with the following
# properties:
# * size: 50% of available RAM or 4GiB, whichever is less
# * compression-algorithm: kernel default
#
# This device's properties can be modified by adding options under the
# [zram0] section below. For example, to set a fixed size of 2GiB, set
#  zram-size = 2GiB .
[zram0]
zram-size = ceil(ram * 30/100)
compression-algorithm = zstd
swap-priority = 100
fs-type = swap
After making the necessary changes, reload systemd and start the systemd-zram-setup@zram0.service:
systemctl daemon-reload
systemctl start systemd-zram-setup@zram0.service
The systemd-zram-generator loads the kernel module, sets up the zram device, and then creates a systemd swap unit to activate the device as swap. In this case, the generated swap unit is called zram0.swap.
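To confirm that the generator picked up the configuration, you can inspect the setup service and the resulting swap unit, for example:
systemctl status systemd-zram-setup@zram0.service
systemctl list-units --type swap     # the generated swap unit should show up as active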
Checking Compression and Details To verify the effectiveness of the swap configuration, you can use the zramctl command, which is part of the util-linux package. Alternatively, the zramswap utility provided by zram-tools can be used to obtain the same output. During my testing with a synthetic memory load created using the stress-ng vm stressor, I found that I could reach up to a 40% compression ratio.
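For reference, the underlying numbers can be read either with zramctl or directly from sysfs; in mm_stat, the first two fields are the original and compressed data sizes in bytes:
zramctl                          # compare the DATA and COMPR columns for the achieved ratio
cat /sys/block/zram0/mm_stat     # orig_data_size compr_data_size mem_used_total ...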
Memory Overcommit Another use case I was interested in is launching applications that require more memory than is actually available on the system. By default, the Linux kernel estimates the amount of free memory left when user space requests more memory and refuses obvious overcommits (vm.overcommit_memory=0). You can change this behavior by setting the sysctl value vm.overcommit_memory to 1. To demonstrate the default behavior, I ran a test using stress-ng to request more memory than the system had available. As expected, the kernel refused the allocation, and the stress-ng stressor could not proceed.
free -tg                                                                                                                                                                                          (Mon,Jun19) 
                total        used        free      shared  buff/cache   available
 Mem:              31          12          11           3          11          18
 Swap:             10           2           8
 Total:            41          14          19
sudo stress-ng --vm=1 --vm-bytes=50G -t 120                                                                                                                                                       (Mon,Jun19) 
 stress-ng: info:  [1496310] setting to a 120 second (2 mins, 0.00 secs) run per stressor
 stress-ng: info:  [1496310] dispatching hogs: 1 vm
 stress-ng: info:  [1496312] vm: gave up trying to mmap, no available memory, skipping stressor
 stress-ng: warn:  [1496310] vm: [1496311] aborted early, out of system resources
 stress-ng: info:  [1496310] vm:
 stress-ng: warn:  [1496310]         14 System Management Interrupts
 stress-ng: info:  [1496310] passed: 0
 stress-ng: info:  [1496310] failed: 0
 stress-ng: info:  [1496310] skipped: 1: vm (1)
 stress-ng: info:  [1496310] successful run completed in 10.04s
By setting vm.overcommit_memory=1, Linux will allocate memory in a more relaxed manner, assuming an infinite amount of memory is available.
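A minimal way to switch the policy at runtime and keep it across reboots (the file name under /etc/sysctl.d/ is just an example) is shown below. With this set, the mmap in the stress-ng test above succeeds, although actually touching all of the requested memory can still end in an OOM kill:
sudo sysctl vm.overcommit_memory=1
echo 'vm.overcommit_memory = 1' | sudo tee /etc/sysctl.d/90-overcommit.conf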
Conclusion ZRAM provides block devices with very fast I/O, and its compression delivers significant memory savings. ZRAM is not restricted to swap usage; it can be used as a normal block device with different file systems. Using ZRAM as swap is beneficial because it is much faster than disk-based swap, and compression means the swapped-out data occupies only a fraction of its original size in RAM. Additionally, adjusting the memory overcommit settings can be useful for scenarios that require launching memory-intensive applications. Note: When running stress tests or allocating excessive memory, be mindful of the actual memory capacity of your system to prevent out-of-memory (OOM) situations. Feel free to explore the capabilities of ZRAM and optimize your system's memory management. Happy computing!

10 June 2023

Andrew Cater: 202306101949 - Release of install media - scripts running now

People are working quietly, cross-checking, reading back steps and running individual steps - we're really almost there for the install media. Just had a friendly, humorous meal out by the barbeque in Sledge's garden. It's been quite a long day but we're just finished. All this and then we'll probably have the first point release for Bookworm 12.1 in about a month. That will contain some few fixes which came in at the last minute and any other issues we've found today. BOOKWORM IS HERE!!
